Ancient DNA traces linguistic links that tie Anatolia to Europe pp. 908, 922, 939, 940, & 982 
will feature the theme, 
an exploration of light at the molecular level, in addition to the keynote address by Dr. Peter Hotez 
and virtual poster sessions welcoming young scientists from around the globe. 


The conference will be held in the Grand Ballroom of 
conveniently located minutes from The Galleria, Memorial Park, and dozens of restaurants and 
retail offerings. 


Light interacts with molecules in myriad ways, producing the colors we see in the world around us, 
skin damage from sunshine, chemical transformations resulting in new molecules, and more. The light 
produced by fluorescent molecules (like those found in day-glow socks) can be sculpted by optics, which 
allows scientists to detect nanoscale properties of individual molecules such as the precise position of 
the molecule in three dimensions, how it is oriented with respect to other molecules, and the lifetime of 
its excited state. The conference explores the latest developments that use sculpted light to 
develop novel technologies for exploring the world at the molecular level. 


W. E. MOERNER 


Conference Chair, 
Stanford University 


New in 2022, the conference will feature a keynote address by Dr. Peter Hotez 
of Texas Children’s Hospital and Baylor College of Medicine. 


The Texas Children’s Hospital Center for Vaccine Development is leading the development of new 
vaccines for poverty-related neglected diseases and COVID-19. Since May 2021, an estimated 200,000 
unvaccinated Americans have died because they refused COVID-19 vaccines. They were victims of 
growing anti-science aggression, which has now become a dominant and lethal social force and disease 
determinant in the U.S. 


PETER HOTEZ, 
M.D., PH.D. 

Dean, National School of 
Tropical Medicine, Baylor 
College of Medicine 


The 2022 Welch Award in Chemistry Lecture 


Cell surface glycans constitute a rich biomolecular dataset that drives both normal and pathological 
processes. Their “readers” are glycan-binding receptors that can engage in cell-cell interactions and cell 
signaling. Our research focuses on mechanistic studies of glycan/receptor biology and applications of 
this knowledge to new therapeutic strategies. Our recent efforts center on pathogenic glycans in the 
tumor microenvironment and new therapeutic modalities based on the concept of targeted degradation. 


CAROLYN R. 
BERTOZZI 


Anne T. and Robert 
M. Bass Professor 
of Chemistry, 
Stanford University 

The Welch Foundation wants to hear about your research! For the first time, the Welch Conference will 
feature a virtual poster session inviting young scientists to present their research, answer questions, and 
engage with audiences from all over the world. 


For registration, live-stream information, and a complete program, visit our website at » 
EDITORIAL 


The FDA and scientific priorities 


Earlier this year, when I was confirmed as the new 
commissioner of the US Food and Drug Admin- 
istration (FDA), the world faced ongoing public 
health issues related to the pandemic and war in 
Ukraine, among other challenges. Most notably, 
the US is experiencing a flattening or decline in 
life expectancy compared with other high-income 
countries. As part of a wider effort to reverse this decline, 
relationships between FDA and the biomedical ecosystem 
should be reimagined to facilitate more effective transla- 
tion of science into successful health interventions. 

The biomedical community’s response to COVID-19 has 
provided multiple tools—vaccines; antiviral medications 
and other therapeutics; diagnostic testing—that can help 
prevent infection and transmission and treat patients. 
FDA’s flexibility and guidance regarding compressed time 
frames for research, development, testing, and regulatory 
approval were crucial in respond- 
ing to this ever-changing public 
health threat. Equally important 
was the ability to streamline clini- 
cal trials to efficiently produce data 
that enabled a clear understanding 
of the risks and benefits for new or 
repurposed therapeutics. FDA will 
apply these lessons, where relevant, 
to other areas of biomedical product 
development. A reciprocal emphasis 
on reinventing research translation 
across biomedical sectors is needed 
to meet the moment. 

Despite progress in treating can- 
cer and rare diseases, declines in US life expectancy are 
driven by common, chronic diseases (CCDs), including 
cardiometabolic, lung, and kidney disease, along with 
mental health conditions. Use of tobacco products and 
increases in deaths due to opioid overdose are also driv- 
ing negative national statistics. Existing approaches to 
develop and assess pharmacologic therapies, medical de- 
vices, and interventions for CCDs and addiction, includ- 
ing behavioral techniques, should be reexamined by the 
biomedical community. While the FDA revitalizes its ap- 
proaches, the biomedical community should also review 
its priorities so that it can deliver more new therapies in 
these areas, particularly for those suffering most: racial 
and ethnic minorities, people with less education and 
wealth, and those living in rural areas. Consortia com- 
prising patients, researchers, regulators, and the medical 
products industry are needed, as exemplified by substan- 
tial progress in therapies for cystic fibrosis, type 1 diabe- 
tes mellitus, and multiple myeloma. FDA will focus on 
translating new science into treatments and diagnostics, 
working closely with the National Institutes of Health and 
its grantees, the recently formed Advanced Research Proj- 
ects Agency for Health, and patient advocacy groups. 

Integrating research and clinical care with access to 
digital information presents enormous potential to bene- 
fit the research enterprise and health outcomes. However, 
societal norms for research participation and data shar- 
ing should be revamped, including attention to cyber- 
attack vulnerability. It is hoped that modernizing FDA’s 
digital infrastructure will prompt complementary efforts 
by the biomedical community to evolve a national digital 
infrastructure that enables swift, systematic gathering of 
patient data, collectively yielding detailed understanding 
of “real-world” benefits and risks of medical products. 
This network should also support faster, more compre- 
hensive approaches to risk-benefit issues for food, where 
considerations of nutritional value, 
supply-chain data, presence of toxic 
substances, whole-genome pathogen 
sequencing, and impact of inten- 
tional gene modification can be inte- 
grated with health outcomes. In the 
face of global climate change, a safe, 
resilient, and effective food system 
is a priority. Involving consumers 
and patients directly in the process 
of FDA's technology development 
represents a major opportunity to 
harness social science and human 
behavioral research and engage with 
the public more meaningfully. 

Misinformation has undermined the credibility of 
science and evidence-based processes that improve so- 
cietal well-being. The FDA is working to restore public 
confidence by making clear, direct, and transparent 
communication a priority. However, broader efforts 
spanning the biomedical ecosystem are needed to create 
an effective consortium that ensures access to truthful in- 
formation about biomedical science and health. 

Although FDA’s primary mission is to safeguard the 
well-being of Americans, it operates in a global environ- 
ment. FDA will continue to work with scientific, public 
health, and regulatory communities around the world to 
establish robust information systems for monitoring viral 
and biological environments, disease transmission, and 
the safety of medicines and food produced outside the US. 
COVID-19 has been a clarion call for public health agen- 
cies and biomedical scientists worldwide to collaborate as 
never before. 

—Robert M. Califf 


Robert M. Califf is commissioner of the US Food and Drug Administration, 
Silver Spring, MD, USA. commissioner@fda.hhs.gov 
U.S. CDC Director Rochelle Walensky, saying she plans to reorganize the agency because 
of shortcomings during the pandemic. She promised more accountability and timeliness. 


Drought shrank a reservoir to reveal a 
prehistoric monument in Spain this month. 


Drought exposes ‘Spanish Stonehenge’ for study 


Scientists are rushing to examine a 7000-year-old 
stone circle in central Spain that had been drowned 
by a reservoir for decades and was uncovered after 
the drought plaguing Europe lowered water levels. 
Nicknamed the “Spanish Stonehenge” (although 
2000 years older than the U.K. stone circle), the 
Dolmen of Guadalperal (above) was described by archaeo- 
logists in the 1920s. The approximately 100 standing 
stones, up to 1.8 meters tall and arranged around an oval 
open space, were submerged in the Valdecañas reservoir 
after the construction of a dam on the Tagus River in 1963. 


The water has receded a few times since, most recently in 
2019, when archaeologists worked to create a digital re- 
cord of the site. This time they hope to better understand 
engravings on the stones, which include a human figure 
and a squiggly line, and document any further damage to 
the monument’s porous granite. The drought has uncov- 
ered other historic sites across Europe, such as a Roman 
fort in Spain, World War II-era German warships in the 
Danube River, and “hunger stones” (bearing dates en- 
graved by people suffering from famines caused by past 
droughts) in the Danube, Elbe, and other rivers. 


Fauci sets a date to step down 


LEADERSHIP | Anthony Fauci, the physi- 
cian and immunologist who has led the 
$6.3 billion U.S. National Institute of 
Allergy and Infectious Diseases (NIAID) 
for 38 years and has been an ardent, 
embattled voice for scientific evidence 
during the COVID-19 pandemic, will 
leave government service in December. 
Fauci, 81, is also resigning as chief of an 
NIAID immunology lab and as President 
Joe Biden’s chief medical adviser. In a 
statement, he said he intends to continue 
to mentor future scientific leaders. He 
told The Washington Post that his plans 
also include writing a book and teach- 
ing. The tireless, blunt Brooklyn native 
has served under seven presidents and 
helmed NIAID from the early days of the 
HIV/AIDS epidemic in the 1980s through 
the 2001 anthrax attacks, the 2009 swine 
influenza pandemic, and outbreaks of 
West Nile, Ebola, and Zika viruses. As 
a member of former President Donald 
Trump’s White House Coronavirus Task 
Force, Fauci became an icon in the 
United States and worldwide, working 
to counter Trump’s public misstatements 
about the pandemic. Fauci also clashed 
with Republican lawmakers such as 
Senator Rand Paul (KY) over pandemic 
public health measures and their base- 
less assertions that the SARS-CoV-2 virus 
originated in a lab in Wuhan, China, that 
had received NIAID funding. 
Gene therapy for a blood disease 


BIOMEDICINE | The U.S. Food and Drug 
Administration last week approved a 
genetic treatment for the blood disorder 
beta-thalassemia, marking the third U.S. 
gene therapy for a rare disease. The dis- 
order causes low hemoglobin and severe 
anemia, and the regular blood transfusions 
used to treat it can cause iron buildup 
that damages organs. The new treatment, 
Zynteglo, from manufacturer bluebird bio, 
relies on a virus to deliver a gene for hemo- 
globin into the patient’s bone marrow cells, 
grown in culture; the cells are then infused 
back into the body. In clinical trials, 89% of 
treated patients no longer required trans- 
fusions. Zynteglo won European approval 
in 2019 but was removed from the market 
after countries balked over the high price; 
in the United States it will cost $2.8 mil- 
lion per one-time treatment, making it one 
of the most expensive drugs ever. Bluebird 
is testing a different product that uses 
the same method for sickle cell anemia, 
which is more common in the United 
States than thalassemia. 


How rabbits invaded Australia 


GENETICS | Most of Australia’s rabbits, 
which have become a scourge of crops and 
native plants, descended from a single 
introduction by a farmer in 1859, a genetics 
study has found. Rabbits were repeatedly 
brought to Australia, including aboard the 
first fleet of British ships to reach Sydney, 
in 1788. But the fateful introduction came 
in 1859, when relatives in England of 
Thomas Austin, a wealthy settler, sent 
him bunnies that he used to establish a 
colony on his estate outside of Melbourne. 
These animals may have had a leg up in 
colonizing the continent: The DNA of their 
contemporary progeny includes a large 
amount from wild ancestors, which may 
have given Austin’s brood an adaptive edge 
in Australia, says the study in this week’s 
issue of the Proceedings of the National 
Academy of Sciences. The findings may 
aid efforts to find new ways to control the 
country’s rabbit populations and perhaps 
eradicate them. 


A cheaper way to save forests 


CONSERVATION | To get the most for their 
money, forest conservationists should target 
areas where relatively small investments 
could protect lots of species, rather than 
pushing to preserve a specific percent- 
age of the landscape, a study argues. The 
analysis, published by economists in the 
17 August issue of Nature, comes as many 
advocates are urging nations to adopt a 
new goal of protecting 30% of their lands 
by 2030. That “30x30” goal risks wasting 
limited resources, the authors say. Instead, 
they offer a 50-year plan, based on a study 
of 458 forested regions, that calls for first 
saving relatively species-rich forests where 
conservation costs are low. They found that 
limiting deforestation in the plan’s first year 
in just 18 regions—including in Turkey’s 
Anatolian Peninsula and in Melanesia, 
for example—would produce the greatest 
benefit. Their plan would protect a total of 
46 regions within a decade. It gives a lower 
priority to saving forests where costs would 
be high, such as near cities in Brazil. 


Mum’s the word on flawed papers 


PUBLISHING | One-third of 330 top-ranked 
scientific journals do not publish outsiders’ 
critiques of papers after they appear, even 
though most of these journals belong to a 
publishing organization, the Committee on 
Publication Ethics, that encourages mem- 
bers to publish critiques, a study found. 
Among the majority that do run critical 
letters, commentaries, or online comments, 
many impose deadlines and limit length. 
Together, these choices by both types of 
journals raise barriers to correcting flawed 
papers, the study’s authors write this week 
in Royal Society Open Science. In the 
207 journals that accepted comments, only 
about 2% of 2066 randomly selected papers 
mentioned the existence of a relevant, 
postpublication critique, the team found. 
Its study analyzed the 15 journals with the 
highest journal impact factors in each of 
22 scientific disciplines. All 15 journals in 
clinical medicine published critiques; only 
two math journals did. 


5% 


Portion of monkeypox tests that 
came back positive in 200 men 
who have sex with men and did not 
have symptoms of the disease. 
It's unclear whether these men, who 
participated in a screening program 
in France, could transmit 
monkeypox. But the study authors 
say vaccination campaigns should 
not be limited to people who 
have had contact with symptomatic 
cases. (Annals of Internal Medicine) 


ASTRONOMY 


2019 Event Horizon Telescope image 


2022 photon ring image 


Black hole’s ‘photon ring’ unveiled 


Like art restorers discovering a hidden image under an old master, astro- 
physicists have reprocessed data used to create the first image of a black hole 
and sifted out only the light from its “photon ring.” The crisp, bright circle 
(above, right) shows photons held in a tight orbit near the edge of the event 
horizon, the point at which even light cannot escape the black hole’s gravity. To 
reveal the ring, researchers reanalyzed the now-iconic image of the supermassive 
black hole in the nearby galaxy M87, released in 2019 by the Event Horizon Telescope. 
That image (above, left) shows a fuzzy, fiery band of light that combines photons 
escaping from the ring and emissions from matter swirling around the black hole. 
New algorithms were able to tease out just the photons that came from the ring. 
The researchers reported last week in The Astrophysical Journal that the ring’s size, 
structure, and unvarying nature closely match theoretical predictions, confirming the 
ability of strong gravity to bend light into a tight curve. 
NEWS 


Ancient DNA from the Near East 
probes a cradle of civilization 


Studies seek clues to origins of farming, early languages 


By Andrew Curry 


Few places have shaped Eurasian his- 
tory as much as the ancient Near East. 
Agriculture and some of the world’s 
first civilizations were born there, and 
the region was home to ancient Greeks, 
Troy, and large swaths of the Roman 
Empire. “It’s absolutely central, and a lot of 
us work on it for precisely that reason,” says 
German Archaeological Institute archaeo- 
logist Svend Hansen. “It’s always been a 
bridge of cultures and a key driver of innova- 
tion and change.” 

But one of the most powerful tools for un- 
raveling the past, ancient DNA, has had little 
to say about this crucible of history and cul- 
ture, in part because DNA degrades quickly 
in hot climates. 

Now, in three papers in this issue, re- 
searchers present DNA from more than 
700 individuals who lived and died in the 
region over more than 10,000 years. Taken 
together, the studies survey the history of the 
Near East through a genetic lens, exploring 
the ancestry of the people who first domesti- 
cated plants and animals, settled down into 
villages, spread the precursors of modern 
languages, and peopled Homer’s epics. 

The massive data set includes DNA from 
burials stretching from Croatia to modern- 
day Iran, in a region the authors call the 
Southern Arc. “The sample size is phenom- 
enal, and fascinating,” says Wolfgang Haak, a 
geneticist at the Max Planck Institute for Evo- 
lutionary Anthropology who was not part of 
the team. “The beauty of this is it’s bringing it 
all together in a bigger narrative.” 


Researchers sampled DNA from individuals including 
this man, buried about 8000 years ago in Turkey. 

That narrative is no simple tale. The genet- 
icists, led by David Reich and Iosif Lazaridis 
of Harvard University, worked with archaeo- 
logists and linguists, gathering thousands 
of skeletal samples and extracting and ana- 
lyzing DNA, mostly from the dense petrous 
bone of the ear, over nearly 4 years. They 
applied better extraction methods and com- 
pared new samples with existing data, allow- 
ing them to identify even short bits of DNA. 

Their genetic story starts with the early 
days of farming, a period known as the Neo- 
lithic. Farming began in Anatolia in what is 
present-day Turkey. But the DNA shows that 
the people who experimented with planting 
wheat and domesticating sheep and goats 
starting about 10,000 years ago weren't 
simply descendants of earlier hunter-gath- 
erers living in the area. Dozens of newly se- 
quenced genomes suggest Anatolia absorbed 
at least two separate migrations from about 
10,000 to 6500 years ago. One came from 
today’s Iraq and Syria and the other from 
the Eastern Mediterranean coast. In Anato- 
lia they mixed with each other and with the 
descendants of earlier hunter-gatherers. By 
about 6500 years ago, the populations had 
coalesced into a distinct genetic signature. 

Another genetic contribution came from 
the east about 6500 years ago, as hunter- 
gatherers from the Caucasus entered the 
region. Then about 5000 years ago, a fourth 
group—nomads from the steppes north 
of the Black Sea, known as the Yamnaya— 
arrived, adding to the genetic picture but not 
fundamentally redrawing it. “The people of 
the Southern Arc are mostly coming from 
Levantine, Anatolian, and Caucasus compo- 
nents,” Lazaridis says. “The Yamnaya are like 
a layer of sauce, added after 3000 B.C.E.” 

This scenario supports existing evidence 
that agriculture arose in a network of people 
interacting and migrating in this region, oth- 
ers say. “This fits really well with archaeo- 
logical data,” says Barbara Horejs, scientific 
director of the Austrian Archaeological Insti- 
tute, who was not part of the team. 

But other scholars question the team’s 
conclusion about the origins of a different 
cultural shift, the spread of Indo-European 
languages. Nearly every language spoken in 
Europe today stems from a common root, 
shared with Indian languages. Researchers 
have for years traced it to the Bronze Age 
Yamnaya, who rode both east and west from 
the steppes. But the authors of the new pa- 
pers argue the Black Sea steppe wasn’t the 
birthplace of Indo-European, but rather a 
stop along a journey that began earlier and 
farther to the south, perhaps around mod- 
ern-day Armenia. 

Because of similarities between Indo- 
European and Anatolian languages such 
as ancient Hittite, linguists had guessed 
the Yamnaya had left both genes and lan- 
guage in Anatolia, as well as Europe. But 
the new analysis finds no Yamnaya ancestry 
among ancient Anatolians. The team sug- 
gests they and the Yamnaya instead share 
common ancestors in a hunter-gatherer 
population in the highlands east of Anato- 
lia, including the Caucasus Mountains. That 
area, they argue, is the most likely place for 
people to have spoken an Anatolian-Indo- 
European root language, perhaps between 
5000 and 7000 years ago. “That Caucasus 
component is a unifying type of ancestry we 
find in all places where ancient Indo-Eu- 
ropean languages are spoken,” says Lazari- 
dis, who is first author on all three papers. 

However, Guus Kroonen, a 
linguist at Leiden University, 
says this contradicts linguistic 
data. The early people of the 
Caucasus would have been fa- 
miliar with farming, he says, 
but the deepest layers of Indo- 
European have just one word 
for grain and no words for 
legumes or the plow. Those 
speakers “weren’t very famil- 
iar with agriculture,” he says. 
“The linguistic evidence and 
the genetic evidence don’t seem to match.” 

Lazaridis says it’s possible the root tongue 
“was originally a hunter-gatherer language,” 
and so lacked terms for farming. The team 
agrees more evidence of “Proto-Indo- 
Anatolians” is needed, but says the Caucasus 
is a promising place to look. 

Throughout, the papers address some 
critiques of previous ancient DNA work. 
Some archaeologists have complained 
that earlier research attributed almost 
everything—status, identity, power shifts— 
to pulses of migration recorded in DNA. But 
the new papers acknowledge, for example, 
that some migrations into Anatolia may not 
have been relevant or even perceptible to 
those living at the time. “That’s a response 
to criticisms coming from the archaeologi- 
cal literature,” says Hartwick College ar- 
chaeologist emeritus David Anthony, who 
is not a co-author but has worked with the 
team. “It’s really healthy.” 

In another example, Yamnaya were bur- 
ied in elite tombs after they moved into 
the region north of Greece, suggesting a 
link between ancestry and social status. 
But during the later Mycenaean period in 
Greece—the time Homer mythologized—the 
new data suggest Yamnaya descendants had 
little impact on Greek social structure. 

Evidence comes in part from the spec- 
tacular Mycenaean burial of the Griffin 
Warrior, a man who died in 1450 B.C.E. near 
Pylos, Greece. He carried no traces of steppe 
ancestry, though dozens of both elite and 
humbler graves in Greece did. University of 
Cincinnati archaeologist Shari Stocker, who 
helped excavate the tomb in 2015 and col- 
laborated on the new studies, says the lack 
of correlation between social status and 
steppe ancestry is no surprise—and a wel- 
come dose of nuance from geneticists. 

The papers also acknowledge the nuances 
of identity in later periods, for example in 
Imperial Rome. Previous genetic studies 
had shown that as the empire coalesced, the 
ancestry of people in and around the city of 
Rome shifted, with most having roots not in 
Europe, but farther east. 

After obtaining dozens of additional 
Roman-era genomes from the region, the 
team zeroed in on the source 
of those newcomers: Anatolia. 
But the researchers agree that 
people with “Anatolian” DNA 
moving to the Italian peninsula 
likely saw themselves as citi- 
zens or slaves of Rome, rather 
than as part of a distinct “Ana- 
tolian” ethnic group. Contem- 
porary chroniclers remarked 
on the new faces in Rome—and 
referred to many of them as 
“Greeks,” perhaps because the 
eastern peoples had spoken Greek for centu- 
ries, Lazaridis says. 

Some archaeologists still think the papers 
claim too much influence for ancestry. “DNA 
cannot tell us anything about how people 
shape their life worlds, what their social sta- 
tus was,” says archaeologist Joseph Maran 
of Heidelberg University. He says terms like 
“Yamnaya ancestry” suggest the Yamnaya 
spread by moving directly from place to 
place, rather than through a complex min- 
gling of their descendants with local popu- 
lations over centuries or more. “Equating 
history with ‘mobility’ and ‘migrations’ is ... 
old-fashioned.” 

And although the studies are a big step 
forward, in covering 10,000 years with 700 
samples, they leave plenty of questions un- 
answered, with large stretches of time and 
space represented by a handful of samples. 

All the same, several archaeologists in- 
cluding Horejs think this injection of DNA 
data will shape research going forward. “It’s 
our task now, and obligation as archaeolo- 
gists, to use this new data to rethink archae- 
ological models,” she says. 


PLANT ECOLOGY 


Global drought experiment reveals the toll on plant growth 


Artificial droughts sharply cut carbon storage 


By Elizabeth Pennisi 


Europe and many other parts of the 
world are currently grappling with 
extreme drought, and that could be 
bad news for efforts to curb climate 
change, concludes a new global study 
of how shrubs and grasses respond to 
parched conditions. 

Grasslands and shrublands cover more 
than 40% of Earth’s terra firma, and they 
remove hefty amounts of carbon dioxide 
from the air. But by deliberately blocking 
precipitation from falling at 100 research 
sites around the world, researchers found 
that a single year of drought can reduce the 
growth of vegetation by more than 80%, 
greatly diminishing its ability to absorb car- 
bon dioxide. Overall, plant growth in the ar- 
tificially drought-stricken grassy patches fell 
by 36%, far more than earlier estimates. But 
the study, presented last week at the annual 
meeting of the Ecological Society of America 
in Montreal, also found great variability: Veg- 
etation at 20% of the sites continued to thrive 
despite the lack of water. 

“I was surprised at how much drought im- 
pacts varied,” says Drew Peltier, a physiological 
ecologist at Northern Arizona University who 
was not involved in the study. “This suggests 
there is some resilience in these systems; the 
question is how much and for how long.” 

A decade ago, with droughts forecast to 
become more frequent and severe in a warm- 
ing world, three ecologists—Melinda Smith 
of Colorado State University; Osvaldo Sala of 
Arizona State University, Tempe; and Richard 
Phillips of Indiana University, 
Bloomington—grew frustrated with their 
field’s inability to come up with consistent 
results about how dry weather affects plant 
productivity, particularly in grasslands and 
shrublands. So, they and their colleagues 
hammered out a standardized procedure for 
creating artificial droughts in the field and 
put out a call for researchers willing to partic- 
ipate in what they dubbed the International 
Drought Experiment (IDE). 

“We expected to have about 20 sites,” Smith 
recalls, but what’s called Drought-Net has 
grown to 139. Some are in places, such as Iran 
and parts of South America, where scientists 
had conducted little drought research. Most 
are in shrub- and grasslands, where it’s easier 
to erect structures to block precipitation. 

Each team agreed to re-create the condi- 
tions of the worst drought documented in 
their region over the preceding century. Most 
blocked precipitation by mounting plastic 
roofing slats over 1-meter squares of ground; 
the slats were spaced according to how much 
rain, sleet, or snow needed to be diverted. On 
average, the roofed plots received less than 
half of their typical precipitation. 

Each team tallied the kinds and numbers 
of plants in the covered areas, as well as in 
similar plots left open for comparison. After 
a year of treatment, the researchers surveyed 
the plants again and harvested, dried, and 
weighed all of the aboveground plant mate- 
rial in the roofed and open plots. 


Last week, the researchers reported initial 
results from 100 shrubby and grassy sites. 
At some, such as plots of shortgrass prairie 
in Colorado, there was “catastrophic loss,” 
reported Kate Wilkins, a grassland ecologist 
now at the Denver Zoo who worked with 
Smith. Plant productivity in the water-starved 
area declined by 88%. “What surprised me 
was just how dead it was,” Wilkins said. 

In contrast, in a temperate grassland in 
Germany the simulated drought “did not 
have any significant effect,” says disturbance 
ecologist Anke Jentsch-Beierkuhnlein from 
the University of Bayreuth. In general, the cli- 
mate at the German site was wetter and the 
drought less severe than on the prairie. 

Overall, plants in wetter environments 
withstood this short-term drought bet- 
ter than those in drier climes, and shrub- 
dominated plots fared better than those dom- 
inated by grasses, Wilkins reported. Shrubs 
tend to have more extensive roots that can 
reach moisture deep in the soil. The average 
decline seen in the grassy plots, 36%, is “al- 
most twice as much of a reduction as other 
studies have shown,” notes Elsa Cleland, an 
ecologist at the University of California, San 
Diego. But she and others think the data are 
believable because the study used standard 
methods across a wide variety of sites. 

Many researchers have continued to moni- 
tor their plots, with some planning to collect 
data for four or more years, in part to simu- 
late prolonged droughts. The additional data 
could help climate modelers sharpen esti- 
mates of how much less carbon is absorbed 
by shrub- and grasslands in a drought, says 
Sarah Evans, an ecologist at Michigan State 
University’s W.K. Kellogg Biological Station. 
IDE results could also help ecologists forecast 
which ecosystems are most at risk during dry 
spells, as well as broader ecological ripple ef- 
fects. Less plant matter can mean less food 
for grazing animals such as rodents and for 
their predators, Evans notes. “The health of 
many ecosystems and their biodiversity relies 
on plant production,” she says. 


This year, drought scorched these soccer fields in London and stressed shrubs and grasses across the globe. 

Farmers, ranchers, and land managers 
might also benefit. Jentsch-Beierkuhnlein 
notes that during the current European 
drought, intensively managed grasslands 
with relatively few species, such as hayfields, 
have been hard hit. Planting more diverse 
assemblages might enable such grasslands 
to “keep delivering ecosystem services even 
under severe drought,” she says.

That’s an important insight, says Andrew 
Hector, an ecologist at the University of Ox- 
ford, given the extreme heat and drought of 
recent years. “The main message of these ex- 
treme conditions is that climate change ... is 
happening already,” he says. They “show just
how relevant [the IDE] is.”


ATMOSPHERIC SCIENCE 


Researchers watch how Arctic storms chew up sea ice


Airborne campaign to study 
summer cyclones could 
reveal air-ice interactions 


By Eric Hand 


The storm began somewhere between
Iceland and Greenland, as disturbances
high and low in the atmosphere
united into a full-fledged cyclone. One
day later, the vast spiral of winds had
grown nearly as big as Mongolia. It
was on a beeline for Svalbard, the archipel- 
ago between Norway and the North Pole, and 
heading for the thin floes girding the Arctic’s 
vulnerable pack of summer sea ice. And that 
made John Methven very, very happy. 

Last week, Methven, an atmospheric dy- 
namicist at the University of Reading, flew 
through the storm as part of an airborne 
campaign based out of Svalbard’s Longyear- 
byen, the world’s northernmost town. As his 
Twin Otter plane shuddered through tropi- 
cal storm-force winds of 100 kilometers per 
hour, flying just 15 to 30 meters above the 
sea surface, Methven and the crew took mea- 
surements of the ice, water, and air before 
returning to a bumpy landing on Svalbard. 
It was the third, and strongest, cyclone that 
U.K., U.S., and French teams had captured in
a monthlong effort. 

“It’s really exciting to get this sequence
[of cyclones],” says Methven, leader of the 
U.K. component of the Thin Ice campaign, 
the first airborne project to study how these 
summertime storms affect sea ice. “People 
are going to be pretty pleased.” 

With data from the ice-skimming plane, 
a second aircraft flying through the tops of 
the storms, and dozens of weather balloons, 
the Thin Ice teams hope to learn how these 
common but poorly understood storms 
form, function, and chew up sea ice. They 
also plan to gauge how the properties of the 
sea ice—smooth, rough, or missing—feed 
back into the storms themselves. The data 
should help improve Arctic weather models 
and sharpen the picture of how summer 
cyclones may be accelerating the retreat of 
Arctic sea ice, already on the run because of
global warming. 

The storms whip up waves that menace 
Arctic fishing vessels and send storm surges 
into coastal villages. “A lot of these com- 
munities are having to move,” says Julienne 
Stroeve, a polar scientist at the University of 
Manitoba (U of M). “They’re falling into the 
ocean.” The cyclones also threaten the cargo 
and cruise ships rushing to take advantage of 
newly ice-free passages in the summer. Bet- 
ter models will “make it safer” to travel the 
region, says Alex Crawford, an Arctic climate 
scientist at U of M. “You’ll have a better clue
to stay in port or go on.” 

Summertime Arctic cyclones are very dif- 
ferent beasts from tropical cyclones: not as 
powerful but sometimes larger. The aptly 
named Great Arctic Cyclone of 2012 stretched 
5000 kilometers across, spanning the entire 
Arctic Ocean. With little topographic relief to 
disrupt them, the storms can wander around 
the Arctic Ocean for weeks on end. “There’s 
nothing to get rid of them,” Methven says. 

Hurricanes are fueled by the energy in 
water vapor rising from a warm ocean, but 
Arctic cyclones get their spark from hori- 
zontal temperature differences. At high al- 
titude, kinks in the polar vortex, a collar of 
winds 5 to 8 kilometers up that keeps warm 
midlatitude air separated from cold Arctic 
air, can start air spinning. Near the surface, 
temperature differences between the ocean 
and the ice front, or between land and the 
ocean, can do the same. When a low-level 
spin-up meets up with one at the top, they 
intensify into a cyclone. Other Arctic cyclones 
are imports—storms from lower latitudes 
that wind up in the “garbage bin” of the Arc-
tic, Crawford says. 

Unlike hurricanes, Arctic cyclones blow 
across an ocean partly covered by sea ice— 
with complex consequences for both winds 
and ice. Early in the summer, the storms’ 
cloud cover can inhibit melting. But by Au- 
gust, as ice thins near the edge of the pack, 
cyclones can speed melting by pushing 
floes to warmer waters, breaking up ice into 
smaller floes that melt more easily, and creat- 
ing waves that stir up warmer waters. Mean- 
while, the rough surface of the ice can act 
as a brake on the winds. Yet the friction can 
also help a storm persist by keeping its core 
stable, Methven says. 

Weather and climate models struggle to 
forecast both the storms and their interac- 
tions with sea ice. In early August, two lead- 
ing models differed by a full day in when 
they predicted a major storm would arrive, 
says Jim Doyle, an atmospheric scientist 
at the Naval Research Laboratory and the 
leader of the U.S. component of Thin Ice. 
Methven says the U.K. Met Office’s model 
creates storms that tend to melt summer ice 
too fast, whereas the model at the European 
Centre for Medium-Range Weather Forecasts 
leaves too much ice lingering. 

The models perform poorly in part because 
data on Arctic conditions are relatively scant, 
with few weather stations. The models also 
struggle with the physics of Arctic clouds, 
which often contain a mix of frozen and liq- 
uid droplets. “Getting the balance between 
the liquid and ice phase is really, really hard,” 
says Ian Renfrew, a meteorologist at the University
of East Anglia. Thin Ice’s high-flying
aircraft will help tune models by gathering
detailed cloud data from within the storm.

The Great Arctic Cyclone of 2012 spanned the Arctic
Ocean and erased 500,000 square kilometers of ice.

Modelers are also eager for surface-level 
data, especially along the rough, busted-up 
perimeter of the ice pack, a region called 
the marginal ice zone. In the past few years, 
Renfrew says, a few models have begun to in- 
clude a parameter to account for the rough- 
ness of the marginal ice instead of treating 
it as uniformly smooth. That seems to im- 
prove the models’ forecasts of cyclones and 
ice loss, but researchers don’t know whether 
their parameter matches reality. By directly 
measuring the roughness of the ice and how 
its friction pushes back on storms, the ice- 
skimming flights should help models forecast 
the complex interplay of winds and ice. 

The storms are a major driver of sea-ice 
retreat. The 2012 Great Arctic Cyclone de- 
stroyed 500,000 square kilometers of ice—an 
area the size of Spain, says Steven Cavallo, 
an atmospheric scientist at the University 
of Oklahoma, Norman. Cyclones routinely 
destroy a couple hundred thousand square 
kilometers of ice and could ultimately be re- 
sponsible for up to 40% of annual ice losses, 
he says. “We think it’s pretty significant. And 
it’s growing.” 

Doyle doesn’t see any sign that climate 
change is creating stronger or more frequent 
summertime storms, in recent decades at 
least. But he says warming makes the ice 
more vulnerable to the regular parade of cy- 
clones. “The ice is thinning, and so the Arctic 
cyclones are having a much bigger impact.” 

Models suggest the Arctic will lose all its 
summer sea ice by 2050, if not sooner. How 
that will affect the summer storms is “the 
million-dollar question,” says Elina Valkonen, 
an atmospheric scientist at Colorado State 
University. Competing forces are at work. 
The open, warmer ocean is expected to pro- 
vide more moisture and fuel for storms, but 
it would also reduce the low-level spin-ups 
that spark them, by eliminating temperature 
gradients at what was once the ice front and 
between ocean and land. 

In unpublished work, Valkonen and col- 
leagues looked at scenarios for the year 2100 
from a set of models tuned for an ice-free Arc- 
tic. They found no change in the predicted 
barometric pressure for the summer storms, 
which defines their strength. And although 
the number of storms rose slightly, that was 
only due to imported storms from lower lati- 
tudes, not cyclones generated in the Arctic. 
Still, it might not all be good news. Without 
rough ice to slow them, the storms tended to 
be longer lasting, with faster winds, Valkonen 
says. “When you’re a fisherman in the Arctic,
that’s what you care about.”

ANIMAL HEALTH


Deadly bird flu establishes 
a foothold in North America 


H5N1 has continued to kill wild birds and poultry this
summer. The fall migration could bring it back in force 


By Erik Stokstad 


When an outbreak of highly pathogenic
H5N1 avian influenza spread
across North America this spring,
researchers hoped for a replay of
what happened after a different
avian flu variant arrived in the
United States in December 2014. Although 
more than 50 million birds died or were de- 
stroyed in a matter of months, costing farm- 
ers more than $1.6 billion, the virus had 
essentially vanished by June 2015. Poultry 
outbreaks ended, wild birds stopped dying, 
and migratory waterfowl didn’t bring the vi- 
rus back when they returned from their sum- 
mer breeding grounds in Canada. 

But this time is different. H5N1 infections 
in both wild bird species and poultry have 
continued in parts of the United States and 
Canada over the summer, dashing hopes 
that warmer temperatures would halt the 
spread. And whereas the 2015 outbreak 
primarily affected Midwest poultry farms, 
H5N1 has spread to practically the entire 
continental United States and infected at 
least 99 wild bird species, a record. Whether 
migratory birds will cause additional intro- 
ductions in the fall is “the million-dollar 
question,” says Bryan Richards, emerging 
disease coordinator at the U.S. Geological 
Survey’s National Wildlife Health Center.

Even if they don’t, scientists worry the virus 
may continue to circulate year-round, posing 
a permanent threat to poultry farming and 
wild birds, including several endangered spe- 
cies. “Impacts on wild birds may persist for a 
very, very long time,” Richards says. Europe 
may show what lies ahead: There H5N1 has 
already become a fixture in wild birds and 
has caused bigger and bigger outbreaks over 
the past 3 years, causing record damages to 
the poultry industry (Science, 13 May, p. 682). 

H5N1 first emerged in poultry in China’s
Guangdong province in 1996 and since then 
has caused several major outbreaks around 
the world. It has evolved to infect waterfowl 
species without causing significant harm, 
allowing the birds to spread the virus far 
and wide. In the current outbreak, water- 
birds are thought to have carried the virus 
to Canada from Europe and then down the 
eastern seaboard (Science, 29 April, p. 441). 
Bald eagles, owls, and other predators died 
after eating infected waterbirds. In February, 
H5N1 reached the Mississippi flyway, where
snow geese and other species migrate to 
northern Canada. Along the way it infected 
poultry operations, forcing farmers to cull 
40 million chickens and turkeys. Later in the 
spring, the virus slowly moved westward. 

Avian influenza devastated Caspian tern populations
on Lake Michigan islands.

By now, H5N1 has been detected in more
than 2000 wild birds in the United States,
compared with just 99 during the 2015 out- 
break; biologists suspect the virus is much 
more transmissible than its predecessors. “It 
has just exploded in the breadth of species 
that it’s observed in,” says Wendy Puryear, a
wildlife virologist at Tufts University. 

Infections began to fall in May, although 
some species continued to be afflicted. Black 
vultures, which pick up H5N1 when they 
scavenge carcasses, are still dying by the 
hundreds, says Rebecca Poulson, a wildlife 
disease researcher at the University of Geor- 
gia. “It’s still hitting those scavengers pretty 
hard,” she says. And in June, researchers in 
New England were surprised when a second 
wave of infections struck seabirds. “All of a 
sudden, it was like a switch had been flipped 
again,” Puryear says. 

Seabirds are particularly vulnerable be- 
cause many nest in dense colonies. North- 
ern gannet populations crashed in parts 
of eastern Canada, as they have in Europe. 
In Lake Michigan, Caspian terns—locally 
endangered—were very hard hit. H5N1 rarely
infects mammals, but this wave has killed 
hundreds of harbor seals in Maine; Puryear 
and colleagues are trying to learn whether 
the virus can spread between seals or they 
all were infected by birds or their feces. The 
United States and the United Kingdom have 
each seen one human H5N1 case so far. 

Now, all eyes are on the migratory birds, 
which fan out over a large area as they re- 
turn to the United States from the north and 
could spread the virus widely. Researchers 
with the Canadian Wildlife Service (CWS)
have collected samples from 1000 snow geese
on their Arctic breeding grounds, but testing
them for H5N1 could take another month
or two, says CWS waterfowl biologist Jim
Leafloor. U.S. federal and state biologists are 
already testing live and hunter-killed migra- 
tory ducks and geese. 

Regardless of whether a new surge of in- 
fections arrives from the north, many re- 
searchers say the virus is already entrenched 
in some parts of the United States. If those 
areas overlap major poultry farming areas, 
the consequences could be serious. Farmers 
could face a constant risk of major losses, and 
Richards says they would need to maintain or 
tighten biosecurity measures—meticulously 
cleaning boots and equipment, for example. 

Much remains to be learned. In wild birds, 
for example, just how H5N1 moves from one 
individual to another and between species is 
still a mystery, says Tufts wildlife virologist 
Kaitlin Sawatzki. “It’s going to be a very com- 
plicated story,” she says. “It’s hard to predict,
and we’re nervous.”

ASTRONOMY


Many-eyed scope will make movies of the stars 


Argus Array will combine hundreds of off-the-shelf telescopes to capture fleeting events 


By Daniel Clery 


Argus Panoptes, the all-seeing, many-eyed
giant of Greek mythology, is
about to take physical form in the
mountains of North Carolina. In October,
an array of 38 small telescopes
will begin monitoring a slice of visible
sky 1700 times the size of the full Moon.
Known as the Argus Array Pathfinder, it will 
register changes in the stars second by sec- 
ond, essentially making a nightlong celes- 
tial movie. Its developers hope it will pave 
the way for a much larger Argus Array with 
900 telescopes that by 2025 could watch the 
entire visible night sky. 

The Argus telescopes join others aiming 
to capture short-lived or rapidly changing 
astrophysical events, known 
as transients, including ex- 
ploding stars, ravenous black 
holes, neutron star merg- 
ers, and maybe even stars 
briefly eclipsed by the long- 
postulated hidden planet in 
our Solar System. The full 
Argus Array would watch 
the sky with more mirror 
area than all other transient 
telescopes put together, says 
team leader Nicholas Law of 
the University of North Caro- 
lina, Chapel Hill. 

“The potential is enor- 
mous,” says Igor Andreoni of 
the University of Maryland, 
College Park, who is not in- 
volved in the project. As well 
as catching real-time events, 
Argus will build an archive of 
images showing objects before they explode 
or change. “We’ll know the history of anything
that happens in the sky above a certain
brightness,” Andreoni says. “We’re entering a
new era of time-domain astronomy with an 
explosion of different sorts of telescope de- 
sign,” adds Carole Mundell of the University 
of Bath. 

Argus aims to achieve its unique vision 
with hundreds of off-the-shelf telescopes, 
each just 20 centimeters across and watch- 
ing a different patch of sky. The final array 
will match the light-gathering power of a 
telescope with a single 5-meter mirror, which 
typically costs hundreds of millions of dol- 
lars, but cheap components should keep Ar- 
gus’s cost below $20 million, Law says. The
challenge will come in stitching together the 
array’s 900 images into a single, seamless 
movie of the night sky. “We’ve spent an awful 
lot of time on the data pipeline,” Law says.
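
The comparison with a single large mirror is a matter of collecting area. As a rough sketch, the Python snippet below sums the area of many small apertures and converts it to the diameter of one equivalent circular mirror. The function name and the `obstruction_fraction` parameter are illustrative assumptions, not part of the Argus project's software, and the 30% loss in the second call is a made-up value used only to show how blockage and optical losses shrink the idealized figure.

```python
import math


def equivalent_diameter_m(n_telescopes, aperture_m, obstruction_fraction=0.0):
    """Diameter of one circular mirror with the same total collecting area.

    obstruction_fraction is a hypothetical knob for the share of each
    aperture's area that is blocked or lost; 0.0 means an ideal aperture.
    """
    if n_telescopes <= 0 or aperture_m <= 0:
        raise ValueError("telescope count and aperture must be positive")
    if not 0.0 <= obstruction_fraction < 1.0:
        raise ValueError("obstruction_fraction must be in [0, 1)")
    # Usable collecting area of a single small telescope.
    area_each = math.pi * (aperture_m / 2.0) ** 2 * (1.0 - obstruction_fraction)
    total_area = n_telescopes * area_each
    # Invert area = pi * (D / 2)^2 to get the equivalent diameter D.
    return 2.0 * math.sqrt(total_area / math.pi)


# Idealized case: 900 telescopes, each 20 centimeters across.
ideal = equivalent_diameter_m(900, 0.20)
# Same array with an assumed (illustrative) 30% area loss per telescope.
lossy = equivalent_diameter_m(900, 0.20, obstruction_fraction=0.3)
print(round(ideal, 2), round(lossy, 2))
```

With no losses, 900 apertures of 20 centimeters come out equivalent to a 6-meter mirror; any realistic blockage or loss pulls that figure down toward the 5-meter class the team quotes.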

In 2015, his team built a smaller instru- 
ment called Evryscope (Science, 3 July 2015, 
p. 14). That had 27 telescopes, each 7 centi- 
meters across, looking outward from the sur- 
face of a hemispherical dome. Its successes 
included spotting a stellar flare—larger than 
any seen before—from our nearest neighbor 
star Proxima Centauri, but the team wanted 
to scale up to see objects outside our Galaxy. 

The Argus Array’s 900 telescopes will each watch a different patch of sky for rapid changes,
from supernovae to the passing shadow of the hypothetical Planet 9.

Instead of looking out from a dome’s
surface, the Argus telescopes will sit in a
10-meter-wide bowl, all aiming out a single
skylightlike window in a dome. Over the
course of the night, the bowl and telescopes
will pivot slowly to follow the stars as Earth 
rotates. To capture quick-fire images, design- 
ers plan to replace the charge-coupled device 
(CCD) light sensors used in most telescopes 
with complementary metal-oxide-semicon- 
ductor detectors, which can read out data in 
less than a second compared with many sec- 
onds for CCDs. 

Grants totaling $1.3 million from the Na- 
tional Science Foundation (NSF) and Schmidt 
Futures, a private foundation, funded the 
38-scope prototype. Law and colleagues ex- 
pect to test it in the coming weeks before 
transferring it first to a site in the Appala- 
chian Mountains near Chapel Hill for debug- 
ging and later to Mount Laguna Observatory
in California. The team hopes to show off its 
capabilities before seeking NSF funding for 
the full Argus Array after the turn of the year. 

Data from Argus Pathfinder and its suc- 
cessor will be freely available in real time, 
and the software will issue automatic alerts 
when it detects an event. That will allow 
other, larger telescopes to quickly swivel to 
the same spot in the sky and collect more de- 
tailed data, a boon for astronomers probing 
stellar outbursts such as flares, supernovae, 
and gamma ray bursts. 

Observers normally don’t spot supernovae, 
for example, until hours after the event. “Get- 
ting closer in time means you get closer to 
the source” of the explosion, says Shrinivas 
Kulkarni of the California Institute of Tech- 
nology, a pioneer in efforts 
to capture transients. If the 
progenitor star is bright 
enough, Argus might also re- 
cord any sudden brightening 
or belches of gas before its 
death—possible precursors 
of the blast. “It makes history 
available in a very compre- 
hensive way,” Kulkarni says.

If Argus had been up 
and running in 2017, it 
might have given an early 
glimpse of the light flash 
from the first ever recorded 
kilonova—a merger of two 
neutron stars—and enabled 
other telescopes to home in 
on it quickly. As it happened, 
gravitational wave detectors 
were the first to sense the 
merger, but they can’t pin- 
point locations accurately and guide other 
telescopes. “We need simultaneous observa- 
tions,” Mundell says.

Argus might even spot the elusive Planet 
9, hypothesized to lurk in the outer Solar 
System. It may be too cold and faint to be 
seen directly. But as it moves across the sky 
it should make background stars briefly 
blink out. “Occultations are certainly a 
promising avenue when it comes to Planet 
9,” says Konstantin Batygin of Caltech, who 
with colleague Mike Brown proposed its 
existence in 2016 from gravitational influ- 
ences on other distant bodies. If Argus pulls 
off that discovery, it will certainly have lived 
up to its formidable namesake. 


SPARKLING WATERS


Tiny Caribbean crustaceans and their bioluminescent 
mating displays are shining new light on evolution 


In the 18th century, the French natu-
ralist Godeheu de Riville was sailing 
across the Indian Ocean when he came 
upon a remarkable sight. The sea “was 
covered over with small stars; every 
wave which broke about us dispersed 
a most vivid light, in complexion like 
that of a silver tissue electrified in 
the dark,” he recounted in his journal. 
When de Riville examined the sparkling 
water with his microscope, he discovered 
that the “small stars” were tiny crustaceans 
now known as ostracods. 

Centuries later, in 1980, marine biologist 
James Morin was scuba diving just after 
sunset in the Virgin Islands when he no- 
ticed bright blue dots blinking on and off 
several meters away. When he shone his 
flashlight through the water, he saw scores 
of ostracods flitting across its beam. After 
multiple dives, he discerned that the flashes 
weren’t random. The ostracods lit up in spe- 
cific patterns in space and time, much like 
the courtship flashes of fireflies that light 
up summertime meadows. The realization 
changed the course of Morin’s career. 

Now a professor emeritus at Cornell 
University, Morin has spent the past 
4 decades working with a small, dedicated 
group of colleagues to unravel the myster- 
ies of what they describe as the most spec- 
tacular natural wonder that most people 
will never see. Male ostracods only display 
for about an hour, shortly after sunset on 
moonless nights in warm Caribbean seas. 
Most recreational divers don’t dive at 
night, and those who do tend to use lights, 
which prompt the creatures to switch off 
for the evening. 

No bigger than a grain of sand, ostra- 
cods abound in fresh and saltwater. “They 
are very cute but also sort of bizarre—like 
a cross between a crab 
and a tiny spaceship,” 
says Timothy Fallon, an 
evolutionary biochemist 
at the University of Cali- 
fornia (UC), San Diego. 


A long exposure 
captures ostracods 
in motion on a
Bonaire reef, driven 
in part by currents. 
By Elizabeth Pennisi


Only seagoing ostracods are biolumines- 
cent, and it’s not their bodies that glow. 
Rather they spew out glowing mucus. In 
most of the world’s oceans, ostracods do 
this for defense—to startle and distract 
would-be predators. But in the Caribbean, 
and only in the Caribbean, as Morin and 
colleagues discovered, those bright blue 
dots can double as mating calls. Today, 
thousands of dives later, they believe those 
signals have driven Caribbean ostracods to 
diversify into more than 100 species. 

With modern genetic tools, they’ve been 
using these creatures to investigate the 
factors that wedge species apart, including 
sexual selection, driven by female prefer- 
ences; geographic isolation; and genetic 
drift—the accumulation of random genetic 
changes. In just the past 2 years, research- 
ers have figured out how to grow ostracods 
in the lab, a development that will allow 
them to dissect the molecular mechanisms 
of evolution in a way once possible only 
in more conventional lab animals such as 
nematodes and fruit flies. 

“The ability to ask interesting ques- 
tions about evolutionary patterns across 
multiple species is a powerful tool,” says 
Christopher Cratsley, a behavioral ecologist 
at Fitchburg State University who works 
on fireflies. Ostracods are an “elegant sys- 
tem” for doing so, he says. The mechanics 
and biochemistry of their light flashes are 
relatively simple, and many species of os- 
tracods overlap in small areas. Compared 
with other animals with complex mating 
rituals—songbirds, say—they may more 
readily yield clues about the forces that 
generate biological diversity. 


IN JAPAN, dried ostracods are popular as 
curiosities because they glow when re- 
hydrated. They’re called umi-hotaru—“sea
fireflies”—and in the first half of the 20th
century, they caught the eye of Princeton 
University biochemist E. Newton Harvey. 


He used dried ostracods to work out the 
basic biochemistry of bioluminescence, 
which has evolved independently about 
100 times. Organisms as disparate as bac- 
teria, fungi, fish, and insects use it to evade 
predators, attract prey, or communicate 
with their own kind. For several, such as 
fireflies, it’s a means of courtship. 

In the Caribbean, the light show takes 
place underwater. When male ostracods 
sense the water is dark enough, they take 
off from the reef or seagrass bed where 
they spend most of their time and begin 
their display. Females swim toward the 
flashes, as do nonflashing males racing to 
intercept them. 

To watch the spectacle, sea firefly re- 
searchers in scuba gear position them- 
selves on the sea floor just after dark, using 
red lights to find their way. It’s an eerie ex- 
perience. At night, shallow reefs resound 
with snapping shrimp and the crunch of 
parrot fish chomping on coral. Deeper wa- 
ters are spookily quiet. 

In the early days, biologists used a night 
vision monocular attached to a VHS cam- 
era to capture the displays. The images 
were grainy and had a limited field of view. 
“It was very hard to see a whole display,”
says Gretchen Gerrish, one of Morin’s first 
sea firefly graduate students and now an 
evolutionary ecologist at the University of 
Wisconsin, Madison. The video equipment 
has improved, but even now researchers 
often take waterproof notes by writing 
on pieces of PVC pipe with a mechanical 
pencil. A rubber band, rolled down a fin- 
ger’s width after each line, helps them keep 
their place. After documenting a display, 
they swim to the flashes and scoop up the 
creatures with a net. 

Back in the lab—often a makeshift setup 
in a hotel room—they sort through their 
catch and examine their collected ostra- 
cods under a microscope to identify the 
species. Sometimes a subtle difference in 
the shape and size of the reproductive or- 
gans or in the relative proportions of body 
parts is all that distinguishes one species
from another. So far, they’ve named more 
than 20 species; about 100 more await for- 
mal description. 

By the early 2000s the work had re- 
vealed that the behavior of courting ostra- 
cods is surprisingly complex: Flashes can 
be dim, bright, or even different tints. They 
last from milliseconds to multiple seconds. 
And while generating them the ostracods 
move in species-specific ways—up, down, 
or on a slant—creating strings of flashes 
that range in length from less than 1 me- 
ter up to 30 meters. Everywhere Morin and 
his colleagues went they found new species 
and new behaviors. “It was really hard to 
comprehend the scale of unknown knowl- 
edge we were stumbling upon,” recalls 
Nicholai Hensley, an integrative evolution- 
ary biologist at Cornell. 


TO DELVE MORE DEEPLY into sea firefly bio- 
logy, Gerrish recruited nearly all the other 
sea firefly researchers to study the crea- 
tures, in a project that extended across five 
sites in the Caribbean between 2015 and 
2019. They teamed up with Martin Dohrn, 
a filmmaker known for capturing nature 
in poorly lit places, to develop an under- 
water camera system that records visible 
and infrared light at the same time. That 
enabled the researchers to see the blue 
flashes as well as the animals themselves. 
The flashes show up as visible light, but 
the only way to see the ostracods without 
drowning out the flashes is to use infrared. 
“It transformed our abilities to document
the displays in the field,” Morin says. 

The group also began to tease apart re- 
lationships among ostracod species. Todd 
Oakley, an evolutionary biologist at UC 
Santa Barbara, and his team used RNA 
samples to sequence the “transcriptome,” 
or set of expressed genes, for Caribbean os- 
tracod species and compare them with the 
transcriptomes of other ostracods, includ- 
ing those of the Pacific and Indian oceans, 
which don’t use bioluminescence for court- 
ship. Based on the degree of similarity in 
the transcriptomes, they arranged each 
species on a family tree and used the num- 
bers of genetic differences between species 
as a “molecular clock” to determine when 
each originated. 
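
The molecular clock idea can be sketched in a few lines. The Python example below assumes a strict clock, in which substitutions accumulate at one constant rate, and converts a count of sequence differences into a divergence time. The function name, the rate, and the counts are hypothetical values chosen for illustration; they are not taken from Oakley's analysis, which may well rely on more elaborate, calibrated methods.

```python
def divergence_time_myr(differences, sites_compared, subs_per_site_per_myr):
    """Estimate time since two lineages split, in millions of years.

    Assumes a strict molecular clock: a single constant substitution
    rate (hypothetical here) applies to both lineages.
    """
    if sites_compared <= 0 or subs_per_site_per_myr <= 0:
        raise ValueError("sites_compared and rate must be positive")
    if not 0 <= differences <= sites_compared:
        raise ValueError("differences must lie between 0 and sites_compared")
    # Proportion of compared sites that differ between the two species.
    distance = differences / sites_compared
    # Both lineages accumulate changes after the split, hence 2 * rate.
    return distance / (2.0 * subs_per_site_per_myr)


# Toy numbers only: 20 differences over 1000 sites, 0.001 subs/site/Myr.
print(round(divergence_time_myr(20, 1000, 0.001), 2))
```

Because both lineages keep changing after they split, the distance is divided by twice the rate. In this toy case, 20 differences across 1000 sites at 0.001 substitutions per site per million years gives roughly 10 million years.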

This work revealed that the ability to gen- 
erate light, presumably for defense at first, 
evolved about 197 million years ago, Oakley 
reported in a preprint posted on 13 April. 
The ostracod family tree indicated evolu- 
tion co-opted this defense mechanism for 
mating displays only once, about 151 million 
years ago. The early date came as a surprise 
to Oakley and others, who had assumed 
this behavior was a relatively recent innovation
that arose after the Isthmus of Panama
formed 3 million years ago, separating Caribbean
ostracods from their Pacific kin. It’s
not clear why glowing courtship displays
didn’t spread to the Pacific before the barrier
formed, but one possibility is that ostracods
simply don’t disperse widely. Another
is that visual signals don’t work as well in
murky Pacific waters.


Stars of the sea

About 150 marine ostracod species, including Vargula hilgendorfii, native to Japan (below), produce
light when disturbed. The sesame seed-size crustaceans secrete mucus and light-generating
molecules, creating a blue glow that startles predators, often enabling the ostracod to escape. In the
Caribbean, some ostracod males also flash in species-specific patterns to attract mates.

Ostracod anatomy
Bioluminescent ostracods have evolved compound eyes and a specialized organ that produces
the molecules they use to generate light.
Diagram views: side view; cross section; side view, left shell omitted.
Labeled structures: compound eye, right shell, first antenna, second antenna, mandible, light organ (at right), copulatory limb.

A bright reaction
The enzyme luciferase causes a three-amino-acid molecule called luciferin to react with oxygen and
release a photon of light.

Light source
Separate storage compartments and dedicated rows of secretory nozzles in the light organ
isolate luciferin from luciferase until a flash is needed.

Let there be light
When the light organ squirts out luciferin and luciferase inside a ring of mucus,
the two molecules react and make the mucus light up blue.


The consortium is now using ostracods
as a new lens on one of the biggest questions
in evolutionary biology: What drives
the formation of new species? Female
choosiness about mates is one candidate.
Such sexual selection can drive the evolu- 
tion of brighter colors or bigger horns in 
males, but there has been little evidence 
it can actually split one species into two. 
Oakley thinks his studies of biolumines- 
cence provide some of the first proof. He and 
his student Emily Ellis compared the rates 
at which new species form in ostracods that 
use their flashes as courtship displays with 
the rate in those that don’t. In principle, 
new species should arise more frequently 
in populations where sexual selection is at 
play. And they do, Oakley’s team reported
in 2016—not only in Caribbean ostracods, 
but also in insects, fish, and octopuses 
with bioluminescent courtship displays. 

Gerrish, too, has found evidence in os- 
tracods that sexual selection can drive di- 
versity. She and others had observed that 
when multiple ostracod species live close 
together, individual species’ displays be- 
come more distinctive than when the same 
species are found on their own, which 
maintains diversity by making coexisting 
species less likely to crossbreed. 

At a study site in Belize, Gerrish and 
her graduate student Nick Reda have also 
found evidence that distinctive displays 
might be pushing one species to divide into 
two. Most males of the species 
Photeros annecohenae swoop up- 
ward as they flash, painting their 
string of glowing dots in a consis- 
tent direction. But a few swoop 
downward, and this tendency 
seems to be increasing, suggest- 
ing enough females prefer this 
behavior to cause it to proliferate. 
In one seagrass bed along the reef 
where Gerrish has been working, 
50% of the males now exhibit this 
odd behavior, and in an adjacent 
bed that percentage has grown to 
70%, she and Reda plan to report 
in a paper later this year. As this 
mating display becomes more 
common, it’s more likely to iso- 
late this population, a key step 
toward speciation. 

On that same reef, Gerrish and 
Reda see hints that other well- 
known evolutionary forces are 
at play. Multiple populations of 
P. annecohenae that live along 
the reef exhibit a genetic gradi- 
ent, with quite distinct populations at ei- 
ther end, she, Reda, and colleagues plan to 
report later this year. They think random 
genetic drift is driving some of the differ- 
ences. But geographic isolation also seems 
to contribute. In places where storms have 
created short breaks in the reef, isolating 
populations on either side, genetic dif- 
ferences are greater than expected from 
the gradient. 


INCREASINGLY, ostracod researchers are 
delving into how evolution shaped the 
animals’ most distinctive feature—their 
flashes. Oakley and Hensley, in collabora- 
tion with Elizabeth Torres, an evolution- 
ary biologist at California State University, 
Los Angeles, have sequenced the ostracod 
genes for luciferase, the enzyme that adds 
oxygen to a molecule called luciferin to 
make light in all bioluminescent organisms.
“Every new species we look at, it is
a new gene,” Oakley says. Moreover, those 
tiny differences in the ostracod luciferase 
“correlate with different types of signals,” 
as this enzyme’s activity can dictate the 
brightness, duration, and other features of 
each flash. The findings show how novelty 
at the molecular level can lead to behav- 
ioral changes that could foster the emer- 
gence of new species. 

Such studies would be easier in the lab, 
but ostracods “are notoriously difficult to 
culture,” says Trevor Rivers, a behavioral 
ecologist at the University of Kansas, Law- 
rence, who was an early student of Morin’s. 
The animals are very picky about temperature and light and dark regimens, and
researchers aren’t even sure what most species eat. And their relatively slow
life cycle, on the order of months, means a long wait to see whether the
animals will breed in a particular setup.

In this image from Panama, vertical strings of flashes, each created by an
individual male ostracod, linger in the water. When one male begins to display,
others join in, aiming to lure females.

To tackle this challenge, Oakley and 
Jessica Goodheart, then a postdoc in his 
lab, focused on Vargula tsujii, a California 
species that is a close relative of the Carib- 
bean ostracods but does not use its glow for 
mating. For more than 2 years, their team 
tweaked the aquaria and the water flow, 
tried different feeding regimens (chicken 
livers proved better than white fish), and 
adjusted the number of ostracods per tank. 
In 2020, the researchers finally succeeded in 
getting a reproducing population. They and 
others hope to eventually do the same with 
Caribbean ostracods. But already, Gerrish 
says, “We’re poised to run in a bunch of different
directions” to learn more about these creatures and
the evolutionary forces that shaped them.

Using lab-grown ostracods, Oakley’s team 
has now uncovered evidence for a key idea 
about the evolution of novelty: that new 
traits often spring from modifications of ex- 
isting ones. To control the secretion of the lu- 
minescent mucus, they found, ostracods rely 
on genes similar to those thought to be active 
in venom glands in centipedes and wasps, 
his graduate student Lisa Mesrop reported at 
a meeting this winter. Thus, when ostracod 
bioluminescence arose 197 million years ago, 
it wasn’t newly invented, but rather a novel 
application for an existing gene network. 

Another student in Oakley’s lab, Emily 
Lau, has found ostracods and terrestrial 
fireflies evolved the same mecha- 
nism for regulating light-generat- 
ing luciferin, most likely because 
chemical constraints limit what 
cells can do to control this very 
reactive molecule. Both sets of 
organisms stabilize luciferin by 
adding sulfur to its chemical 
structure, even though the sulfur- 
adding proteins are very differ- 
ent, she has found. Evolution ap- 
pears to be “constrained” to one 
solution, Lau says, suggesting the 
same mechanism is at work in 
other bioluminescent organisms 
as well. 

With ostracods now reproduc- 
ing in the lab, researchers hope 
to finally sequence the entire ge- 
nome. The ostracod genome is 
longer than our own and has re- 
petitive regions and other com- 
plexities that make it challenging 
to sequence from single specimens 
snared at sea. Success in the lab 
would pave the way for genetic 
engineering not yet possible in most multi- 
cellular bioluminescent organisms, Hensley 
says. He’s interested in why ostracods have 
hundreds of luciferase-like genes. By modi- 
fying these genes, he hopes to learn how 
they might fine-tune light production and 
whether they have additional functions. 

The genome might also enable research- 
ers to track down the enzymes that make 
luciferin and modulate its release. Com- 
paring those enzymes with their counter- 
parts in other bioluminescent organisms 
could provide “an opportunity to figure out 
what’s truly necessary to make biolumi- 
nescence,” Hensley says. That’s just one of 
many insights de Riville could never have 
imagined centuries ago when he marveled 
at those shining stars in the sea. 


See videos of ostracod mating displays: HTTPS://SCIM.AG/OSTRACODS
Machine learning ecological networks


Deep-learning tools can help to construct historical, modern-day, and future food webs 


By Eoin J. O'Gorman 


It is perhaps unsurprising that apex
predators, such as whales, sharks, leopards,
and tigers, also tend to be the rarest
species (1). This is largely because of
the imperfect transfer of energy through 
each level in a food chain (2), which 
makes these carnivores more susceptible 
to starvation than herbivores, detritivores, 
or omnivores. Their survival also depends 
on having a large home range for them to 
roam far and wide to find the mates and 
resources needed to sustain their popula- 
tions (3). These vulnerabilities make them 
particularly susceptible to human activi- 
ties, such as habitat loss or being targeted 
by hunters for their trophy status. On page 
1008 of this issue, Fricke et al. (4) adopt a
network-based approach to establish how 
humans have disrupted apex predators 
and other mammalian fauna over the past 
130,000 years. 
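
The energetic argument above can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the 10% transfer efficiency is a commonly quoted rule of thumb assumed here, not a figure from the text, and the starting energy and the function name `energy_at_level` are hypothetical.

```python
def energy_at_level(base_energy, efficiency, level):
    """Energy reaching a trophic level, where level 0 is the primary producers."""
    return base_energy * efficiency ** level


BASE_ENERGY = 10_000.0  # hypothetical energy units fixed by primary producers
EFFICIENCY = 0.10       # assumed fraction of energy passed on at each step

# Energy available to producers, herbivores, and three tiers of carnivores.
energy_by_level = [energy_at_level(BASE_ENERGY, EFFICIENCY, k) for k in range(5)]

for level, energy in enumerate(energy_by_level):
    print(f"trophic level {level}: {energy:.1f} units")
```

Under these assumptions, each step up the chain leaves roughly one-tenth of the energy below it, which is one way to see why a top carnivore can sustain far fewer individuals than the herbivores beneath it.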

Just as the poet John Donne mused that 
“no man is an island,” the same can be said
for all animals on Earth—that none can ex- 
ist in isolation, and all are part of a complex 
tangle of interacting consumers, resources, 
and competitors (5). These ecological net- 
works have been the focus of ecosystem 
ecologists for many decades, often requir- 
ing years of observation, stomach-content 
analyses, and experiments to quantify their 
complex structures. Describing the interac- 
tions in ecological networks is crucial for 
understanding the ecological surprises that 
may occur in ecosystems after environmental
change or human intervention. For example,
the overhunting of sea otters led to
an explosion of their sea urchin prey, whose
overgrazing of kelp eliminated a swathe of 
other species that rely on these underwater 
forests for food and shelter (6). Such “tro- 
phic cascades” have been described in many 
ecosystems, where the loss of one species 
triggers biomass fluctuations across several 
trophic levels (6). Food webs are a central 
tool for tracking or anticipating changes 
along dominant pathways of energy flow, 
whereas network analysis helps to quantify 
the overall stability or resilience of an eco- 
system and to identify key hubs or vulner- 
abilities (7, 8). 

Fricke et al. found that species with more connections
in a food web, such as an apex predator like the lion,
are more vulnerable to extinction.

Some of the findings by Fricke et al. are
unsurprising. For instance, their analysis
shows that there are fewer mammal
species today than there were in the Late
Pleistocene around 130,000 years ago.
However, some of their other findings are
more transformative. They find that there
has been a systematic demise of mammals
with more connections in the food web, 
which can be explained by the greater sus- 
ceptibility to extinction and range contrac- 
tion of larger predators than smaller prey. 
This change has resulted in much-less-con- 
nected food webs than if the same number 
of species had been extinguished at random 
from the networks. Simpler food webs are 
typically shown to be less stable because 
they are more susceptible to further extinc- 
tions through energetic limitation (7) or a 
lack of redundancy in the choice of avail- 
able prey (9). Furthermore, these food web 
“collapses” were particularly prominent 
after the arrival of humans to the region. 
Of further concern is the observation that 
currently endangered mammals share this 
same characteristic of being the most highly 
connected organisms in the network, which 
would lead to further food web collapses if 
they became extinct. 
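
A toy calculation illustrates why losing well-connected species simplifies a food web faster than random loss would. The sketch below is not the analysis of Fricke et al.: the nine-species web and its predator-prey links are invented for illustration, and the comparison covers only the loss of a single species.

```python
from collections import Counter

# Hypothetical predator -> prey links, invented for illustration only.
LINKS = [
    ("lion", "zebra"), ("lion", "wildebeest"), ("lion", "gazelle"), ("lion", "warthog"),
    ("leopard", "gazelle"), ("leopard", "warthog"), ("leopard", "hare"),
    ("jackal", "hare"), ("jackal", "mouse"),
]


def degrees(links):
    """Count how many feeding links each species takes part in."""
    counts = Counter()
    for predator, prey in links:
        counts[predator] += 1
        counts[prey] += 1
    return counts


def links_after_loss(links, lost_species):
    """Return the links that survive once one species is removed."""
    return [(a, b) for a, b in links if lost_species not in (a, b)]


deg = degrees(LINKS)
most_connected = max(deg, key=deg.get)

# Links left when the most connected species goes extinct.
targeted_remaining = len(links_after_loss(LINKS, most_connected))

# Average links left when one species is lost uniformly at random.
random_expectation = sum(len(links_after_loss(LINKS, s)) for s in deg) / len(deg)

print(most_connected, targeted_remaining, random_expectation)
```

In this made-up web, removing the best-connected predator strips out more links than the average single extinction does, which is the pattern the paragraph above describes at a much larger scale.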

Fricke et al. also provide a valuable tem- 
plate for studying ecological networks in 
the past, present, and future. The complex- 
ity of food webs and the laborious task of 
describing their underlying connections 
have prevented their large-scale use in con- 
servation and biomonitoring of ecosystems 
(5, 10). Recent calls for incorporating eco- 
logical networks into biomonitoring have 
proposed ways of circumventing this ob- 
stacle by taking the species lists obtained 
from routine sampling and inferring the 
links from established databases, filling 
the gaps with modeling approaches (10, 11).
This “guesswork” is often built on the prin- 
ciple that body size is a key determinant of 
which species can interact with one another 
or that predators will consume phylogeneti- 
cally related prey or share the diet of a phy- 
logenetically related predator. Fricke et al. 
demonstrate how machine-learning tech- 
niques can be applied to a broad suite of 
traits to identify which species are likely to 
interact. Their deep-learning algorithm out- 
performs the allometric and phylogenetic 
models in common use by increasing the 
accuracy with which feeding interactions 
can be predicted. Although using machine- 
learning methods to study and model com- 
plex food web structures is not a new en- 
deavor (8, 12), it is yet to become a standard 
tool within the field. 

Historical networks can no longer be ob- 
served, but their structure has been eluci- 
dated using ancient DNA, inferences from 
morphology, and models based on body 
size (13). Integration of machine-learning 
methods should progressively improve the 
reconstruction of paleo networks, enhancing
the understanding of how ecosystems
functioned in the past, how they differed 
from today, and how they responded to 
catastrophes. 

For studying modern-day ecosystems, 
automation of deep-learning algorithms 
would increase the accessibility of network 
science for conservationists and monitor- 
ing agencies, helping to identify critical 
nontarget species or bioindicators of harm- 
ful change to an ecosystem. The approach 
could also be used to detect “potential prey” 
that are underrepresented in current food 
webs (12). For instance, DNA analysis of 
fish diets has revealed that gelatinous salps 
(a tunicate) could be a key alternative re- 
source to declining krill in Southern Ocean 
food webs, despite years of underapprecia- 
tion due to the rapid digestion of their soft 
bodies in fish stomachs (14). 

Looking to the future, a major limitation 
at present is the inability to accurately fore- 
cast network responses to species extinc- 
tions or climate change. Now, it is typically 
assumed that a predator will go extinct if its 
primary prey disappears from the network, 
when it may instead shift its diet to other 
resources (15). Machine learning opens avenues
for anticipating how networks will re-
wire in response to global change by identi- 
fying resources with combinations of traits 
that would make certain prey suitable alter- 
natives for predators. Similarly, knowledge 
of how a perturbation will alter the traits 
that underpin deep-learning algorithms 
would facilitate predictions of how the con- 
nections in the network, and not just the 
nodes, should respond. The accuracy and 
precision of these predictions may increase 
with the help of big data, transforming the 
power of network science for anticipating 
future ecological surprises from lofty aspi- 
ration to tangible reality. 
Interpreting thoughts during sleep

Rapid eye movements during sleep are a readout of thoughts during mouse dreams


By Chris I. De Zeeuw and Cathrin B. Canto


Have you ever noticed someone’s eyes
move rapidly under their eyelids during
sleep? Owing to technical challenges
in measuring eye movements
in freely moving animals with their
eyes half closed, this phenomenon
has remained enigmatic. Sleep-related rapid 
eye movements occur during a specific sleep 
phase called rapid eye movement (REM) 
sleep, which is associated with vivid dreams 
(1). If rapid eye movements reflect thoughts 
during sleep, reading the eye movements of 
others, while observing them sleep, would 
open a window for reading and poten- 
tially manipulating their thoughts during 
dreams. On page 999 of this issue, Senzai and 
Scanziani (2) reveal that in mice, rapid eye 
movements during sleep are a readout of the 
internal sense of direction. 

During the awake state, head direction 
(HD) cells in the thalamus form a circuit that 
functions as a compass, and HD cell activity 
can be read out as the neuronal correlate for 
physical head movement (3). HD cells, to- 
gether with other cells that encode physical 
location in space, have been suggested to be 
important for the encoding of cognitive pro- 
cesses, such as the memorization of a certain 
episode in life, possibly linking spatial and 
temporal aspects of a particular event. Mice 
and other species, including humans, use a 
combination of alterations in heading and 
changes in eye movements to learn how to 
stabilize their gaze (4). In the awake situation,
mammals occasionally make fast eye
movements, called saccades, with alternating 
periods of fixation. This “saccade and fixate” 
gaze pattern is closely linked to rotational 
head movements. When a subject moves the
head, the semicircular canals of the inner
ear sense three-dimensional head rotations,
and the angular vestibulo-ocular reflex opposes
the rotational head movements to fix
the gaze (5). In line with the evolutionarily
tight coupling between eye and head movements,
Senzai and Scanziani found that the
amplitude and speed of saccadic eye movements
can predict physical head movements
in awake and freely moving mice. At the
same time, information from brain activity
through recordings of the HD circuit allows
predictions of the actual physical heading.
Moving the eyes in the same direction as
the head can be particularly advantageous
when the environment is being scanned. For
example, by combining head and eye movements,
mice can quickly sample the sky for
predator birds (6, 7). Training such lifesaving
actions during sleep might increase the
chances for survival (see the figure). Yet, with
the rapid eye movements during REM sleep,
most muscles are paralyzed. Is it possible to
train a goal-directed behavior largely internally
in the networks of the brain without
making physical movements or making them
only partially? Previously it was shown that
brain activity that encodes places that have
been experienced during daytime can reoccur
at an accelerated rate during non-REM
sleep and that this replay, as well as sleep in
general, may facilitate memory formation
and storage (8-10). Whether mice and other
species use REM sleep-specific rapid eye
movements for memorizing daytime visual
information, such as detecting and escaping
a predator, is an unsolved problem.

Dreaming of escape
Senzai and Scanziani found that rapid eye movements of mice encode daytime
visual information and represent the internal sense of direction during sleep.
This cognitive processing might entrain behavior when mice are awake, such as
avoiding a bird of prey.

Awake
As a prey animal, a mouse uses its converging head and eye movements to
actively explore the world for potential threats.

Asleep
During subsequent sleep, the mouse goes through several cycles of non-rapid
eye movement (NREM) sleep to rapid eye movement (REM) sleep. Rapid eye
movements during REM sleep reveal an intentional or cognitive process, such as
dreaming about when the mouse escapes from the predator bird. During this
phase, the optimal avoidance behavior is also reflected in the activity of
head direction cells, which corresponds to the direction of the rapid eye
movements.

[Figure labels: head and eye movement; vivid dream about escape from predator
bird during REM sleep.]

Senzai and Scanziani looked at whether
REM sleep-related rapid eye movements
encode daytime visual information by reading
eye movements and concurrently decoding
the internal representation of heading
through simultaneously recorded HD cell
activity. They found that the mouse brain
processes visual and head movement information
at the same time during REM sleep
and that these information streams are congruent
in that the direction encoded by the
activity of the HD cells corresponds to that of
the actual rapid eye movements. They found
that REM sleep-related eye movements can
be subdivided into two phases: “leading” eye
movements, which predict the magnitude
and directionality of the internal heading,
and sleep-specific “follower” eye movements,
which likely mediate recentering of the eye.

This is fascinating because vestibular head 
movement information from the semicircu- 
lar canals is absent during sleep. At the same 
time, rapid eye movements indicate the on- 
going virtual representation of the HD circuit 
as if the mouse dreams about a particular 
physical head movement. This suggests that 
the rapid eye movements reflect cognitive 
processes that occur during sleep, such as 
manifesting the memory of a short episode 
in life that is critical for survival. 

What still needs to be determined is 
whether these eye movements and HD cir- 
cuit activity also come with a subconscious 
process that requires entrainment of a critical
readiness to act (11). It is also unclear why
such dreams about movement do not manifest
in muscle activity during the non-REM
sleep phase. It will be important to find out to 
what extent activity in the HD system during 
REM sleep is internally generated or driven 
by a common “stimulus” that activates both 
the HD cells and the eye muscle system, akin 
to the coupling mechanism that occurs dur- 
ing the vestibulo-ocular reflex. Given that the 
speed of replay of HD cells during REM sleep 
is similar to that of the awake state and thus 
substantially lower than that of non-REM 
sleep, it is parsimonious to hypothesize that 
peripheral inputs differentially impinge on an 
internally organized network during differ- 
ent arousal states (12). Investigating whether 
cognitive processes that involve head and eye 
movement depend on sleep, and specifically 
REM sleep, will clarify whether the recorded 
rapid eye movements are not just the result 
of random brain waves leading to coupled 
HD circuit activity but also indeed fulfill a 
functional role that is similar to that suggested
for neuronal replay (13).

Interpreting rapid eye movements during 
sleep may open opportunities for improving, 
interfering with, and/or deleting unwanted 
memories by inhibiting or stimulating eye 
movements during the dream phase. Perhaps 
vestibular stimulation with a certain fre- 
quency and amplitude could be used to im- 
prove memory for such movements while 
sleeping. Muscle twitches, which also fre- 
quently occur during REM sleep, might be 
related to the internal heading cues provided 
by rapid eye movements, and analyses of 
these might give further information about 
dreams. During the early development of 
mice, these muscle twitches have been shown 
to be important for healthy brain function 
(14), but whether the twitches that occur dur- 
ing REM sleep in adulthood also form a read- 
out of dreams remains unknown. Indeed, the 
findings by Senzai and Scanziani may make 
dreams come true through studies of eye 
movements of sleeping mice. 
OPTICAL INTERACTIONS


Two nanoparticles dancing as a pair 


Lasers induce and control interactions between two nanoparticles 


By Julen Simon Pedernales 


That light can move matter should
not come as a surprise to anyone.
Scientists have considered this since
1619, when Johannes Kepler suggested
that the tails of comets are
pointing away from the Sun because
of the force exerted on them by the sun- 
light. By the 1970s, scientists had figured 
out how to use laser beams to push, pull, 
and trap nano- and microscale objects. 
Using the same physical principles, light 
can also induce interactions between par- 
ticles that would not otherwise “sense” the 
presence of each other. On page 987 of this 
issue, Rieser et al. (1) report an experimen- 
tal demonstration of such a light-induced 
interaction between two silica nanoparti- 
cles suspended in a vacuum and show that 
the trapping laser beams can be used as a 
control to tune the form and strength of 
the coupling. 

A dielectric nanoparticle embedded in a 
laser field with a wavelength much larger 
than the size of the particle can be treated 
as a point-sized object in calculations. The 
electric field from the laser induces an in- 
homogeneous distribution of charges in 
the nanoparticle, which can be considered 
as a point-like electric dipole that oscillates 
together with the laser field. If the electric 
field is not uniform, the dipole will experi- 
ence a force pushing it toward where the 
electric field is more intense. Because a la- 
ser beam is more intense at its focal point, 
a nanoparticle is attracted to this point 
and can be held in place against gravity. 
This is the working principle behind opti- 
cal tweezers (2). 
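
In symbols, the argument above takes a standard textbook form in the point-dipole limit. The notation is not from the text and is assumed here: \alpha is the particle's polarizability, and the laser field is written as \mathbf{E}(\mathbf{r},t) = \mathrm{Re}[\mathbf{E}_0(\mathbf{r})\,e^{-i\omega t}].

```latex
\mathbf{p} = \alpha\,\mathbf{E},
\qquad
\langle \mathbf{F}_{\mathrm{grad}} \rangle
  = \frac{1}{4}\,\mathrm{Re}(\alpha)\,\nabla \lvert \mathbf{E}_0(\mathbf{r}) \rvert^{2}
```

For a particle with \mathrm{Re}(\alpha) > 0, the time-averaged force points up the intensity gradient, that is, toward the focus of the beam, where \lvert \mathbf{E}_0 \rvert^{2} is largest.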

Consider a second laser beam travelling 
parallel to the first beam and containing its 
own nanoparticle. The two nanoparticles 
suspended in the two laser beams can then 
interact through electric dipole-dipole in- 
teraction. This results in what is known 
as the optical binding force (3), which 
only occurs when the particles are illumi- 
nated. Although the force is called a bind- 
ing force, it can be attractive or repulsive 
and can vary in magnitude. This depends 
on the relative orientation and phase of the 
dipoles, as well as the distance between the 
dipoles, all of which can be controlled by
tuning the laser beams. 

Rieser et al. take advantage of this de- 
pendence and show that by adjusting 
certain parameters of the trapping laser 
fields, it is possible to control how the two 
particles interact, thus turning the trap- 
ping laser beams into control handles for 
adjusting the interparticle coupling. This 
opens the door to engineering arrays of
levitated solids with tunable couplings for
the study of many-body physics in regimes
otherwise inaccessible using previously
existing methods.

A one-way force on two nanoparticles
Levitated nanoparticles scatter light in phase with their trapping laser
fields. The interference between the waves radiated by each nanoparticle can
result in the total scattered light having a preferential direction of
propagation.

Notably, the force that the dipoles exert 
on each other does not need to be recipro- 
cal. At first sight, this might seem to vio- 
late Newton’s third law of the equivalence 
between action and reaction. Nevertheless, 
one can untangle this paradox using the 
following logic. The optical binding is 
mediated by the electromagnetic waves 
from the laser that, after impinging on one 
nanoparticle, are scattered toward the sec- 
ond nanoparticle. However, the radiation 
scattered by each of the two nanoparticles 
will also interfere. For certain configu- 
rations of the particle positions and the 
laser phases, the combined light will in- 
crease in intensity in one direction while 
decreasing, or even canceling out, in the 
opposite direction (see the figure). This re- 
sults in a preferential direction for the to- 
tal light scattered from the incoming laser 
beams. Because light carries momentum, 
this deflection of light implies a change 
in momentum of the incident laser field, 
which needs to be compensated by the 
two nanoparticles by acquiring momen- 
tum in the opposite direction. Therefore, 
the two-particle system experiences a force 
acting on its center of mass in addition to 
the forces acting on the particles by each 
other, which accounts for the apparent 
nonreciprocity. 
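
The interference step in this argument can be checked with a deliberately simplified model. The sketch below treats the two nanoparticles as equal scalar emitters on a line and compares the intensity radiated along that line in the two opposite directions. It is a one-dimensional toy, not the geometry or analysis of Rieser et al.; the function name and the chosen separation and phase are assumptions made for illustration.

```python
import cmath
import math


def directional_intensities(separation_in_wavelengths, phase_difference):
    """Intensity radiated along the axis joining two equal emitters.

    Emitter 1 sits at x = 0 and emitter 2 at x = d, with emitter 2 oscillating
    ahead of emitter 1 by phase_difference (radians). Returns the intensity
    toward +x and toward -x, in units of a single emitter's intensity.
    """
    kd = 2 * math.pi * separation_in_wavelengths
    # Toward +x the wave from emitter 2 travels a shorter path, by d.
    forward = abs(1 + cmath.exp(1j * (phase_difference - kd))) ** 2
    # Toward -x the wave from emitter 2 travels a longer path, by d.
    backward = abs(1 + cmath.exp(1j * (phase_difference + kd))) ** 2
    return forward, backward


# Quarter-wavelength separation with a quarter-cycle phase lag between emitters.
forward, backward = directional_intensities(0.25, math.pi / 2)
print(f"forward: {forward:.3f}, backward: {backward:.3f}")

# With no phase difference the emission along the axis is symmetric.
sym_forward, sym_backward = directional_intensities(0.25, 0.0)
print(f"symmetric case: {sym_forward:.3f}, {sym_backward:.3f}")
```

In the first case the scattered light adds up in one direction and cancels in the other, so the light field carries away net momentum along the axis and the pair as a whole must recoil the opposite way. In the symmetric case no such net push arises.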

With an exquisite degree of control, 
Rieser et al. could discriminate the contribution
of such a nonreciprocal component
in the total optical binding force. Such a 
technical ability enhances the type of in- 
teractions that can be engineered among 
quantum systems. The phenomenon of 
optical binding between optically trapped 
particles is not new. It was reported soon 
after the discovery of optical tweezers (4) 
and has since been reported numerous 
times for different setups, including parti- 
cles levitated in a vacuum (5, 6). However, 
in their experiment, Rieser et al. go beyond 
the observation of the phenomenon and 
reach an unprecedented degree of control 
over the interaction. In this way, they have 
turned the knowledge into a tool for ma- 
nipulating levitated optomechanical sys- 
tems. Gaining increased control over the 
dynamics of mesoscopic objects can help 
the exploration of macroscopic quantum
mechanics. 

Although quantum mechanics is often 
regarded as a theory of the microscopic 
world, its postulates do not indicate any 
particular scale at which they should cease 
to play a dominant role. Quantum proper- 
ties like the superposition principle have 
thus far been observed for systems of up 
to some thousands of atoms (7). Thus, the 
range of applicability of quantum mechan- 
ics remains a matter of scientific debate. 
The answer to this puzzle should ulti- 
mately come from an experiment, and the 
field of levitodynamics (8) suggests that 
nanoparticles levitated in vacuum might 
be the right platform for testing this un- 
seen boundary of where the quantum 
“rules” apparently become weak (9-11). 

Future research efforts may look to com- 
bine the tunable optical binding demon- 
strated by Rieser et al. with the already es- 
tablished ground-state cooling of a levitated 
nanoparticle (12). In doing so, this type of 
experiment would usher in the regime of 
quantum coherent interactions between 
levitated nanoparticles, opening the door 
for the generation of entanglement between 
them (13, 14). One can imagine the possibili- 
ties offered by a programmable, many-body 
platform, where particles are solids with 
rotational degrees of freedom and masses 
that allow them to interact gravitationally 
with each other and with the environment. 
In just one decade, the field of levitated op- 
tomechanics has gone from being proposed 
as a theory for the cooling of a levitated 
nanoparticle to its experimental demon- 
stration. Rieser et al. added a technique to 
the toolbox of this nascent field and may 
be regarded as a milestone on the path to- 
ward realizing the once niche theoretical 
endeavor in applications. 
Ancient genomes and West Eurasian history


Storytelling with ancient DNA reveals challenges 
and potential for writing new histories 


By Benjamin S. Arbuckle and Zoe Schwandt 


Innovations in the sequencing of ancient
(>100 years old) DNA have provided a
new source of historical information
that is complementary to ancient texts,
oral traditions, and the archaeological
record. The geographic application
of this new technology has been uneven, 
focusing largely on Western and Northern 
Europe. On pages 939, 940, and 982 of this 
issue, Lazaridis et al. (1-3) report leverag- 
ing newly sequenced ancient DNA from the 
remains of 777 humans from across West 
Eurasia. They describe the genomic history 
of the “Southern Arc,” the region surround- 
ing the Black Sea and including Southeastern 
and Eastern Europe, the Anatolian penin- 
sula, the Pontic steppe, and the Caucasus. 
Focusing on the relative contributions of 
five ancestral West Eurasian populations, 
they reveal shifting proportions of these 
lineages, which are used to reconstruct 
population movements extending from 
the Neolithic (10,000 BCE) to the Ottoman
(~1700 CE) periods and representing the 
demographic foundations of the modern 
West Eurasian world. 

The origins of farming were once thought
to represent a revolution emanating from
a single “hearth” in the Euphrates river
valley. Lazaridis et al. analyze
ancient genomes from five populations
(ancient Anatolian and Levantine farmers
and Eastern, Balkan, and Caucasus hunter-gatherers)
and show that the origins of
farming in the Neolithic instead involved
complex networks of communication and
reproduction that connected populations
in Anatolia, the Levant, Mesopotamia, and
Iran (1-3). They also focus their powerful
analytical tools on genomes associated 
with the expansionary Yamnaya archaeo- 
logical culture emanating from the Pontic 
steppe in the Bronze Age (third millen- 
nium BCE) (4). Associating the Yamnaya 
with distinctive Eastern hunter-gatherer 
ancestry and the spread of Indo-European 
languages, Lazaridis et al. explore details 
of the expansion of steppe populations into 
Southeastern Europe and Armenia, two 
regions with ancient Indo-European languages
(1-3). In Southeastern Europe, they
show that steppe ancestry is associated
with some but not all individuals from high-status
tombs in Mycenaean Greece, indicating
complex cultural (and biological)
dynamics between the local Minoan
population and incoming steppe migrants.
They also use genomic evidence to evaluate
events described in ancient texts.
They highlight the variable patterns of
ancient Greek colonization, discover that
Anatolian migrants transformed the demographic
composition of Imperial Rome,
and show that ancient DNA can identify
the expansion of Slavic and Turkic speakers
into Eastern Europe and Anatolia in
the medieval period (~500 to 1100 CE).

The Crater of Warriors terracotta vase from Mycenae,
an archaeological site in Greece, is dated to ~1200 BCE.
It depicts Mycenaean warriors on the march.

Although Lazaridis et al. address an extraordinarily
wide range of topics and provide
insights into the Eurasian past (1-3),
several issues common to ancient DNA 
research are evident in the framing of the 
data, especially with regard to the stories 
that are chosen (or not) for explication. In 
ancient genome research, DNA sequences 
are often presented as revealing a “true” 
history of humanity in contrast to historical 
and archaeological records that are prone to 
untruthfulness and imprecision. Although 
base pairs do not lie or exaggerate (though 
they do decay), neither do they tell stories, 
and storytelling that is used to interpret an- 
cient genome analyses inevitably projects 
specific worldviews (5). 

Many of the narratives explored in 
the studies of Lazaridis et al. reflect a 
Eurocentric worldview. For example, the 
naming of the Southern Arc conjures a map 
projection that centers on the western tip 
of Eurasia rather than the Anatolian pen- 
insula—a more intuitive geographic center 
of the research area. Moreover, in terms of 
scale, narratives based on genomes often 
project a high-altitude view of history (6), 
mostly devoid of individuals despite being 
derived from its most personal components. 
Neolithic farmer or steppe Yamnaya genetic 
material moves abstractly on its way from 
central Anatolia to the Balkans or from the 
Don to the Danube and the Peloponnese. 

This approach to history-making offers a 
sanitized version of the past that avoids en- 
gaging with bodily experiences, including 
sexual violence, which was probably involved 
in the movement of genetic material through 
time and space (7). Sexual reproduction itself 
is reduced to a process motivated by compe- 
tition and survival and carried out primarily 
by men. Thus, with this approach, history is 
made through vague processes of migration 
and admixture, but the social mechanisms
remain uncharted (8). 

In constructing the history of the 
Southern Arc, Lazaridis et al. focus on Y 
chromosome lineages, especially to link 
populations in Southeastern Europe, the 
Aegean, and the southern Caucasus to the 
Bronze Age Yamnaya through shared patrilines
(1-3). Ostensibly, there are technical
reasons for this: Y chromosomes allow
for precise reconstructions of lineages and 
divergence times. However, the resulting 
emphasis on patrilineal descent and the ab- 
sence of discussions of parallel networks of 
matrilines (or of XX chromosome humans 
at all) creates a strong sense that the events 
of history are carried forward by “great 
men,” especially those bearing Eastern
hunter-gatherer ancestry and buried under 
mounds of earth and stone. This emphasis 
on Y chromosome networks inadvertently 
projects gender stereotypes into the past, 
perpetuating an androcentric narrative of 
dominance and competition that equates 
chromosomes to gender and gender to be- 
havior. Conversely, approaches that explore 
maternal markers and sex-neutral kinship 
coefficients have recently been used, show- 
ing that alternate methods that overcome 
sex biases are possible (9, 10). 

Lazaridis et al. also present a dataset that 
estimates the phenotype, in terms of hair, 
eye, and skin pigmentation, for humans in 
the Southern Arc and Europe over the past 
15,000 years (1-3). They show that brown
hair and eyes and “intermediate” skin pig- 
mentation was the most common pheno- 
type in the region through time and that, 
despite common stereotypes, Bronze Age 
steppe populations were not dominated by 
blonde and blue-eyed individuals. They also 
document an increase in “light” pigmenta- 
tion over time in West Eurasia, although po- 
tential reasons for selection of these traits 
are not addressed. 

Despite the authors’ intention of dispel- 
ling stereotypes, this brief presentation of 
phenotypes instead amplifies a Eurocentric 
worldview that highlights light pigmenta- 
tion (5). Genomics scholars should think 
carefully about the political ramifications 
of presenting sensitive data such as these 
because they are consumed by a wide audience,
and some may repurpose them (11).
Other ways of presenting the data are pos- 
sible. For example, it is equally interesting 
that southern Mesopotamians sometimes 
called themselves the “black-headed people,”
perhaps to contrast their own phenotype
with the abundance of brown hair in
adjacent regions (12).

Lazaridis et al. have produced an as- 
tounding dataset, unimaginable in its scale 
just a decade ago (1-3). It is important to
acknowledge that the narrative scaffolding
used to interpret these data inevitably
represents worldviews that center certain 
people and places (13). Other narratives and 
scales of inquiry are also possible. For exam- 
ple, one of the most interesting alternative 
interpretations emerging from the studies 
of Lazaridis et al. is the resilience of hunter- 
gatherer genomes in regions outside the 
Fertile Crescent region of Southwest Asia, 
where farming first emerged. In particular, 
and despite documenting large-scale incur- 
sions of exogenous populations into the 
Balkans in the Neolithic and Bronze Age, 
the authors show the surprising resilience 
of local Balkan hunter-gatherer ancestry in 
the region that continues to the present day. 
This finding complicates previous narra- 
tives of linear population replacements and 
further highlights the need to explore the 
complex social interactions associated with 
ancient population movements. 

Moving forward, the growing corpus of 
ancient genomic data will continue to trans- 
form views of human history. This work can 
be particularly effective if researchers rec- 
ognize their lack of neutrality and embrace 
their role in constructing narratives while 
allowing room for diverse perspectives that 
shine light onto people and places whose 
histories are less well known. Lazaridis et 
al. do an excellent job of this, for example, 
by exploring the genomic diversity of the 
poorly known kingdom of Biainili (known 
as Urartu to their Assyrian neighbors) in the 
mountains of Transcaucasia (1-3). Rather
than presenting history from a traditional 
top-down view, genomic scholars can lever- 
age the high resolution of their data to pivot 
between different scales—from the conti- 
nental to the individual—and can address 
historical networks of interaction involving 
both patrilines and matrilines. The studies 
by Lazaridis et al. represent an important 
milestone for ancient genomic research, 
providing a rich dataset and diverse ob- 
servations that will drive the next iteration 
of interpretations of the human history of 
West Eurasia. 
LIGHT TRAPPING


Absorbing light using time-reversed lasers 


Laser cavities can be reverse engineered to create an efficient light trap 


By Jacopo Bertolotti 


In 2019, researchers from the Massachusetts
Institute of Technology
made headlines when they created the 
“blackest” material to date, which had 
the ability to absorb 99.995% of incident
light (1). Beyond aesthetics,
many technologies can benefit
from maximizing light absorption, for
example, in photovoltaics because of the
need to absorb and convert as much light 
as possible into electricity, or on the inte- 
rior surface of a light sensor because of the 
need to minimize unwanted stray light. 
Although there are many ways to create 
something that can absorb some light, the 
endeavor gets more and more difficult the 
closer it gets to 100% absorption. On page 
995 of this issue, Slobodkin et al. (2) report
on a design principle that can absorb light 
on the basis of “coherent perfect absorp- 
tion” (3), which can theoretically absorb 
100% of the light incident on the device. 
Intuitively, one might assume that a material
with a larger absorption coefficient
would be a better light absorber,
but this is not always the case. For
example, a sharp change in the value of the 
absorption coefficient will lead to the re- 
flection of a large fraction of the incident 
light, which sends the light away instead 
of letting it be absorbed. This is why many 
metals make for very good mirrors despite 
having a decent absorption coefficient (4). 
One possible solution to this conundrum 
of absorption versus reflection is to use a 
material with a low absorption coefficient. 
This allows light to enter the medium 
without being reflected away, and conse- 
quently, the light remains in the medium 
for a long time and is gradually absorbed. 
This phenomenon can be observed in the 
darkness at the bottom of the ocean, where 
most of the sunlight has been absorbed by 
water, which itself is nearly transparent. 
However, the obvious practical problem 
with this approach is the amount of space 
it requires, which limits its usefulness for 
most applications. A well-established ap- 
proach to obtaining a similar absorption 
in a smaller volume is to trap the light inside
the medium. This is achieved in those
“ultrablack” paints by using diffusive mate- 
rials, in which the light is scattered many 
times and thus takes a long time to exit the 
medium (1). Although this approach works 
well to remove unwanted light, many appli- 
cations aim to absorb light for use as en- 
ergy (such as a solar panel). It can thus be 
difficult to integrate, for example, the pho- 
tovoltaic material with the black paint to 
improve the overall efficiency of the device. 
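
The trade-off between reflection and absorption described above can be made concrete with two textbook formulas: the normal-incidence Fresnel reflectance of a surface with complex refractive index n + i*kappa, and Beer-Lambert attenuation inside a weakly absorbing medium. The sketch below is illustrative only. The refractive indices and the absorption coefficient are assumed round numbers chosen for the example, not values taken from the studies discussed here.

```python
import math


def fresnel_reflectance(n: float, kappa: float) -> float:
    """Fraction of light reflected at normal incidence when passing from
    vacuum into a medium with complex refractive index n + i*kappa."""
    index = complex(n, kappa)
    return abs((index - 1) / (index + 1)) ** 2


def absorbed_fraction(alpha_per_m: float, depth_m: float) -> float:
    """Beer-Lambert law: fraction of the light that entered the medium and
    has been absorbed after traveling depth_m."""
    return 1.0 - math.exp(-alpha_per_m * depth_m)


if __name__ == "__main__":
    # Assumed, illustrative optical constants (not from the source).
    strong = fresnel_reflectance(n=0.2, kappa=3.4)    # metal-like absorber
    weak = fresnel_reflectance(n=1.33, kappa=1e-8)    # water-like absorber
    print(f"metal-like surface reflects {strong:.1%} of the light")
    print(f"water-like surface reflects {weak:.1%} of the light")

    alpha = 0.05  # hypothetical absorption coefficient, per meter
    for depth in (1.0, 10.0, 200.0):
        print(f"after {depth:5.0f} m: {absorbed_fraction(alpha, depth):.3%} absorbed")
```

With these assumed numbers, the strongly absorbing surface sends most of the light away before it can be absorbed, whereas the weakly absorbing medium lets nearly all of it in but needs a long path to absorb it, which is the space problem that light trapping is meant to solve.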

One way to help a functional absorber, 
such as a photovoltaic cell, to absorb light 
is to create an environment that can trap 
light for the absorber. This is the purpose 
of the cavity used by Slobodkin et al. Elec- 
trodynamic phenomena are invariant un- 
der time reversal, which means that if a 
certain electrodynamic effect exists, then 
there should also exist a time-reversed 
equivalent (5). Because the time-reversed 
equivalent of light emission is light absorption,
any system that emits light can,
in principle, also be used to absorb light. 
In particular, lasers use a cavity to trap 
the light around a light emitter to control 
and amplify it. Replacing an emitter with 
an absorber (even a poor one) will make 
the system operate in reverse and absorb 
the light very efficiently (3). However, the 
problem with this design approach is that 
a laser cavity traps only specific patterns of 
light (modes), and thus time reversal can 
be used only to absorb light that happens 
to be in one of the specific modes. Any 
other mode, such as beams coming at a 
different angle or having a different shape, 
will not be stable in the cavity and will not 
be absorbed as much. What is needed is a 
cavity that traps all possible modes. 
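
Why a poor absorber inside a cavity can still absorb everything is easiest to see for a single mode. The sketch below uses the standard one-port resonator formula on resonance, in which the reflected amplitude is (r - a) / (1 - r a), with r the amplitude reflectivity of the input mirror and a the amplitude that survives one round trip past the absorber. This simplified model (lossless input mirror, perfect back mirror, one mode exactly on resonance) is an assumption made for illustration and is not the analysis used by Slobodkin et al.

```python
def cavity_absorption(mirror_r: float, round_trip_a: float) -> float:
    """Absorbed power fraction for a one-port cavity driven on resonance.

    mirror_r      amplitude reflectivity of the input mirror (0 to 1)
    round_trip_a  amplitude surviving one round trip past the absorber (0 to 1)
    The two values must not both equal 1 (that would be a lossless cavity).
    """
    reflected_amplitude = (mirror_r - round_trip_a) / (1.0 - mirror_r * round_trip_a)
    return 1.0 - reflected_amplitude ** 2


if __name__ == "__main__":
    a = 0.95  # hypothetical weak absorber: about 10% power loss per round trip
    print(f"absorbed in one round trip, no cavity: {1 - a ** 2:.1%}")
    for r in (0.50, 0.90, 0.95, 0.99):
        print(f"input mirror r = {r:.2f}: cavity absorbs {cavity_absorption(r, a):.1%}")
```

In this model the reflection vanishes when the input mirror matches the round-trip loss (r = a), so all of the incident light in that mode is absorbed. Light in any other mode does not meet the resonance condition, which is the limitation the text describes.
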

As amplification, so absorption

Inside a degenerate optical cavity, light rays are reflected and refracted so that they always follow the same
trajectory, which keeps all light modes circulating in the cavity. Although this design was originally devised to
amplify light in a laser, it can be adapted for light absorption. f, focal length of lens.
Light is kept circulating in the cavity and is amplified at each passage by the gain medium.
However, a small fraction of light escapes the cavity as the laser emission.
When the gain medium is swapped for an absorber, the trapped light will gradually be absorbed.
This makes the degenerate cavity a universal light absorber.

Slobodkin et al. designed a system that
can trap all modes of light by using so-called
“degenerate” optical cavities (6). An
optical cavity is said to be “degenerate”
when any ray of light will eventually retrace
its own path in the cavity. The simple
design by Slobodkin et al. uses two mirrors
on the outside and two lenses on the in- 
side (see the figure). The light is trapped 
between the mirrors, and the addition of 
lenses helps guide the rays to always hit 
the same spot on the mirrors after each 
reflection, making the system degenerate. 
As a result, any light trapped between the 
two mirrors and the two lenses is kept cir- 
culating inside the cavity and absorbed at 
each reflection (7). 
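
The retracing condition can be checked with paraxial ray transfer (ABCD) matrices: a cavity is degenerate when the matrix for one full round trip is the identity, so every ray returns to its starting position and angle. The sketch below assumes flat mirrors and a "4f" layout, with each lens one focal length from its mirror and the lenses two focal lengths apart. That geometry is an assumption made for illustration, because the text does not spell out the spacings, and the function names are hypothetical.

```python
from functools import reduce


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def free_space(distance):
    """Ray matrix for propagation over a distance."""
    return ((1.0, distance), (0.0, 1.0))


def thin_lens(focal_length):
    """Ray matrix for a thin lens."""
    return ((1.0, 0.0), (-1.0 / focal_length, 1.0))


def round_trip(focal_length, mirror_to_lens, lens_to_lens):
    """Ray matrix for one round trip between two flat mirrors with two
    identical lenses in between. A flat mirror acts as the identity when
    the beam path is unfolded."""
    # Elements in the order the ray meets them on a single pass.
    one_way = [
        free_space(mirror_to_lens),
        thin_lens(focal_length),
        free_space(lens_to_lens),
        thin_lens(focal_length),
        free_space(mirror_to_lens),
    ]
    # Matrices act right to left, so each new element multiplies from the left.
    single_pass = reduce(lambda acc, m: matmul(m, acc), one_way)
    # The layout is symmetric, so the return pass has the same matrix.
    return matmul(single_pass, single_pass)


def is_degenerate(matrix, tol=1e-9):
    """True when the round-trip matrix equals the identity within tol."""
    identity = ((1.0, 0.0), (0.0, 1.0))
    return all(abs(matrix[i][j] - identity[i][j]) < tol for i in range(2) for j in range(2))


if __name__ == "__main__":
    f = 0.1  # hypothetical focal length in meters
    print("4f spacing degenerate:", is_degenerate(round_trip(f, f, 2 * f)))
    print("detuned spacing degenerate:", is_degenerate(round_trip(f, f, 2.5 * f)))
```

In the assumed 4f layout a single pass inverts every ray, so two passes restore it exactly, whatever its initial position and angle. Changing the lens spacing breaks this, and rays then walk across the mirrors instead of retracing their paths.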

Although the experimental implemen- 
tation of Slobodkin et al. is just a proof- 
of-principle device, it points to what can 
be done with this method in the future. 
Although they achieved only around 95% absorption
in their demonstration, their design
strategy, known as “massively degenerated
coherent perfect absorption,” can in
principle absorb 100% of the incident light.

Their proof-of-concept device is also 
surprisingly robust to imperfections in its
fabrication. This is somewhat unusual for
a method that relies on wave interference 
because it is not uncommon for a misalign- 
ment of just a few tens of nanometers to 
destroy the desired effect. 

Although the exact design of the device 
may not be ready for immediate applica- 
tions, it provides the distinctive advantage 
of enhancing the absorption of any other 
device without modifying it, which could 
be used to improve the efficiency of pho- 
todetectors or photovoltaic units. The use 
of a cavity also opens the way to very so- 
phisticated manipulations of absorption. 
By exploiting the techniques developed 
for lasers, it should be possible to design 
cavities that only trap certain frequencies. 
This would allow frequency-dependent 
absorption or the use of different frequen- 
cies of light to accumulate different phase 
retardations to generate time-dependent 
absorption. 
CORONAVIRUS


Wildlife trade is likely the 
source of SARS-CoV-2 


Multiple transmissions from wildlife at a market in Wuhan 
probably led to SARS-CoV-2 emergence 


By Xiaowei Jiang and Ruoqi Wang 


Almost all pandemic viruses have zoonotic
origins, including severe acute
respiratory syndrome coronavirus
(SARS-CoV) and SARS-CoV-2 (1). During
SARS outbreaks between 2002 and
2003, a live animal source of SARS-like
viruses was identified at a market in Guang- 
dong, China, providing unequivocal under- 
standing of its zoonotic origin. Although the 
most probable reservoir animal for SARS- 
CoV-2 is Rhinolophus bats (2, 3), zoonotic 
spillovers likely involve an intermediate 
animal. Various SARS-CoV-2-susceptible in- 
termediate animals were sold at the Huanan 
Seafood Wholesale Market in Wuhan, such 
as raccoon dogs, foxes, and mink. But these 
were unavailable for testing, so direct evi- 
dence of an animal source is lacking. Thus, 
it remains unknown exactly how SARS-CoV-2 
emerged and led to the COVID-19 pandemic. 
On pages 951 and 960 of this issue, Worobey 
et al. (4) and Pekar et al. (5), respectively, pro- 
vide quantitative evidence that SARS-CoV-2 
emergence was likely caused by multiple zoo- 
notic transmissions due to wildlife trading at 
the Huanan Market. 

The search for the origin of SARS-CoV-2 
resulted in numerous discoveries of SARS- 
CoV-2-related viruses in bats and other 
susceptible animals, with implications for 
virus evolution. The most closely related bat 
coronaviruses to SARS-CoV-2, which were 
sampled in Laos, share an ancestor from 
~30 years ago (3). At the genomic level, two 
of these viruses (BANAL-103 and BANAL-52) 
harbor a nearly identical receptor binding 
motif (responsible for human cell entry) 
to SARS-CoV-2 (see the figure). However, 
30 years of evolution could have led to 
substantial mutational changes in the viral 
genome (2). Therefore, continual sampling 
of SARS-CoV-2-related viruses in bats and 
other susceptible animals in Southeast Asia 
and China may still help characterize the 
evolution and origin of SARS-CoV-2, even if 
the intermediate animals from the Huanan 
Market that carry the direct ancestor of the
SARS-CoV-2 strains isolated in COVID-19 pa- 
tients in Wuhan are no longer available. 

Even without the intermediate animal,
which is unavailable likely because Huanan Market was cleared
from 1 January 2020, it is still possible to
learn how SARS-CoV-2 may have emerged. 
Available epidemiological, genomic, and hu- 
man demographic data from the location 
where human infections first emerged, which 
is not necessarily the place of virus evolution- 
ary origin, can be analyzed to understand the 
beginning of the pandemic. To test whether 
Huanan Market is the source of the COVID-19 
pandemic, Worobey et al. provide epidemio- 
logical evidence that the early cases were cen- 
tered around the market, not other places fre- 
quently visited by people in Wuhan. Moreover, 
subsequent human-to-human transmissions 
shifted from the market and its neighborhood 
to more populated areas of Wuhan, particu- 
larly those with susceptible elderly people. 

Was wildlife trading the origin of the early 
COVID-19 outbreak at Huanan Market? 
Worobey et al. provide evidence that there 
was trading of SARS-CoV-2-susceptible 
wildlife spanning several years immedi- 
ately before December 2019. This created 
opportunities for close, sustained contacts 
between these animals and humans at the 
market, laying the foundation for potential 
zoonotic spillover. Moreover, Worobey et al. 
find that the market stalls that sold suscep- 
tible wildlife species are spatially correlated 
with SARS-CoV-2-positive environmental 
samples. Some of these sampled objects 
were used to handle wildlife, such as a metal 
cage and carts (6). At the beginning of the 
pandemic, two viral lineages of SARS-CoV-2 
(termed S and L) were revealed from the viral 
genomes of early cases (7) and later termed A 
and B. These strains only differ by two muta- 
tions. However, how these lineages relate to 
an early zoonotic spillover was unclear owing 
to the lack of a direct ancestor virus for com- 
parison. Worobey et al. established epidemi- 
ological links of early cases of the two viral 
lineages A and B to the market, and these 
lineages were present in the positive environ- 
mental samples from the market (6). 

Wildlife farming and SARS-CoV-2-related viruses

Major wildlife farming and sourcing provinces for the Huanan Seafood Wholesale Market in Wuhan are shown
with molecular dating of known severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-related
sarbecoviruses. The phylogenetic tree indicates the estimated time of divergence (2, 3). Provinces that supply
wildlife to Huanan Market, according to the World Health Organization (WHO) SARS-CoV-2 origin report (8),
are shown in dark gray. Hypothetical wildlife trading paths for lineage A/B SARS-CoV-2 ancestral viruses and
other failed zoonotic transmissions at the Huanan market are indicated with red arrows.

How could zoonotic transmission lead
to the emergence of two viral lineages at
Huanan Market? Pekar et al. hypothesize
that these two strains result from two inde- 
pendent zoonotic transmissions. Zoonotic 
events are stochastic in nature, so a spill- 
over with successful onward transmission 
between humans normally involves a series 
of failed attempts by the virus, allowing it 
to establish sustained transmissions. For 
two independent zoonotic spillovers to be 
successful, sustained contacts and multi- 
ple zoonotic transmissions between people 
and the animals carrying SARS-CoV-2 at 
the Huanan Market would have been re- 
quired. By using simulations—involving a 
predicted closely related SARS-CoV-2-like 
virus, the viral phylogenetic trees, and epi- 
demiological data of early COVID-19 cases— 
Pekar et al. show that the two viral lineages 
can only be explained by two independent 
zoonotic transmissions. They determine 
that proposed intermediate forms of the two 
lineages with one mutation are sequencing 
artifacts. The likelihood that there are two 
independent lineages of the virus suggests 
that these could only come from the source 
animals and that the most recent common 
ancestors are in animals. 
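
The arithmetic behind the "series of failed attempts" argument can be sketched with a simple binomial calculation: if each animal-to-human spillover independently has only a small chance of establishing sustained transmission, then seeing two established lineages implies that several spillovers probably occurred. The independence assumption and the numbers below are hypothetical and serve only to illustrate the reasoning. This is not the simulation approach used by Pekar et al.

```python
from math import comb


def prob_at_least(k: int, n_spillovers: int, p_establish: float) -> float:
    """Probability that at least k of n independent spillovers each go on
    to establish sustained human-to-human transmission."""
    return sum(
        comb(n_spillovers, j) * p_establish ** j * (1.0 - p_establish) ** (n_spillovers - j)
        for j in range(k, n_spillovers + 1)
    )


if __name__ == "__main__":
    p = 0.2  # hypothetical chance that a single spillover takes hold
    for n in (2, 4, 8, 16):
        print(f"{n:2d} spillovers: P(two or more lineages establish) = "
              f"{prob_at_least(2, n, p):.2f}")
```

Under these assumed values, two established lineages are unlikely after only two spillovers and become likely only when many spillovers occur, which is consistent with the sustained contact between people and infected animals that the text describes.
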
Further analysis suggests that transmissions
from animals sold at the Huanan
Market happened in a short period, likely 
between 1 and 2 months. The genetic diver- 
sity of the virus in the animals that led to the 
COVID-19 outbreak is likely low. This implies 
that the animals could be sourced from a local 
wildlife farm because multiple independent 
transmissions that would have resulted in an 
outbreak require multiple infected animals 
at the market during those months. These 
infected animals could also have come from 
provinces other than the local Hubei region, 
as implied in the World Health Organization 
SARS-CoV-2 origin report (8). But these are 
all speculations because the supply chain 
for these susceptible animals to the Huanan 
Market has not been investigated. 

The zoonotic origins of the SARS and 
COVID-19 pandemics are very similar. Both 
likely involved trading of susceptible live wild- 
life at local markets and people who work and 
live in and around these markets. Developing 
real-time outbreak surveillance systems that 
are better at predicting risks would need to 
consider the supply chains at the human-animal
interface. For example, wildlife trading
for fur, skin, and human food is usually supported
by different farm sizes for socio-economic
reasons, such as small farms in rural
areas in an effort to reduce poverty (9). Wild 
animals harbor various pathogens, including 
potentially pandemic-causing coronaviruses, 
and only a fraction of virus diversity is being 
sampled (10, 11). Farming large numbers of
wildlife inevitably provides opportunities for 
spillover events and so poses unprecedented 
threats to human health. 

When farmed wildlife population sizes are 
large [at the scale of tens of millions or even 
larger (9)] and the underlying infrastructure 
for zoonosis control is lacking, farmed wild 
animals become reservoirs for pathogen ge- 
netic diversity to accumulate (9, 12). The di- 
versity and scale of wildlife farming make zo- 
onosis control almost impractical. Spillovers 
are destined to happen, particularly when 
there are changing driver events, such as al- 
tered demand owing to meat or food short- 
ages or because of cost increases (9, 13). For 
example, since early 2022 there has been 
increased demand in Thailand for cheaper 
meat from crocodile, a wild animal widely
farmed for its skin, during the African swine
fever virus pandemic that resulted in high pig 
and pork prices (9, 14). Although applying ex- 
isting livestock regulations to wildlife farm- 
ing may minimize such risks, its effectiveness 
remains to be seen (15). When the science underlying
multiple-host pathogen evolutionary
dynamics is still not fully understood (12),
it is challenging to establish effective zoono- 
sis control infrastructure. Although Worobey 
et al. and Pekar et al. reveal the likely details 
of early zoonotic and epidemiological events 
that led to the COVID-19 pandemic, without 
knowing the exact animal origin of SARS- 
CoV-2 the threat posed by another similar 
virus from wildlife farming is looming. 
RETROSPECTIVE


James Lovelock (1919-2022) 


Father of Earth system science 


By Timothy M. Lenton 


James E. (“Jim”) Lovelock died on 26
July, his 103rd birthday. An independent
scientist and prolific inventor,
Jim transformed our view of Earth
and the impact of humans upon it.
His Gaia hypothesis revealed that
the thin film of life, air, water, soil, and 
sediments at the planet’s surface is a re- 
markable, self-regulating system. Jim also 
showed how we are disrupting that system, 
discovering important trace gases with 
his own instruments—including ozone- 
depleting chlorofluorocarbons. He was the 
original Earth system scientist. 

Born in Letchworth, UK, in 1919 and 
raised in Brixton, Jim hated doing home- 
work but loved scientific books and na- 
ture walks with his father. While attending 
Birkbeck College at night, Jim learned his 
craft as an apprentice chemist for a con- 
sulting firm. War took him to Manchester 
University, where he earned his bachelor’s 
degree in chemistry in 1941. A conscien- 
tious objector, Jim then joined the National 
Institute for Medical Research at Mill Hill. 
He received a PhD in medicine from the 
London School of Hygiene and Tropical 
Medicine in 1948 and a DSc in biophysics 
from the University of London in 1959. In 
1961, an invitation to consult for NASA in- 
spired him to ditch tenure and spend his 
life as an independent scientist. 

Jim made deep and diverse contributions 
to science, with a passionate disregard for 
conventional disciplinary boundaries. At 
Mill Hill, he gained a reputation as a mas- 
ter inventor of precision instruments, most 
notably the exquisitely sensitive electron 
capture detector. He was a pioneer of cryo- 
biology, including the freezing and resusci- 
tation of small mammals, which gave him a 
keen sense of the resilience of life. 

Working for NASA at the Jet Propulsion 
Laboratory in 1965, Jim was tasked with de- 
tecting whether there was life on Mars. He 
reasoned that the presence of abundant life 
on any planet would show up as a remotely 
detectable disequilibrium in the chemistry of 
its atmosphere (a method still foundational 
to contemporary efforts to detect life on exo- 
planets). A predominance of atmospheric 
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carbon dioxide strongly suggested that Mars 
was lifeless. Looking at Earth’s atmosphere, 
Jim saw an extraordinarily improbable, yet 
remarkably stable, chemical cocktail cre- 
ated by life. He realized that life must play 
a role in regulating the composition of both 
the atmosphere and the climate. Later he 
teamed up with evolutionary biologist Lynn 
Margulis, who put microbiological flesh on 
the chemical bones of the hypothesis, which 
novelist William Golding named “Gaia.” 

In the early 1970s, Jim predicted that oce- 
anic life would make atmospheric gases that 
return essential elements to the land. Using 
his own instruments, he discovered the biogenic
gases methyl iodide and dimethyl sulfide
in the remote marine atmosphere. He
also discovered chlorofluorocarbons everywhere,
providing critical evidence that they
threatened the ozone layer. Subsequently,
Jim realized that dimethyl sulfide produced
by marine algae oxidized to form cloud con- 
densation nuclei, that more small water 
droplets make clouds brighter, and that the 
resulting cooling of the surface would affect 
the algae producing dimethyl sulfide. Such 
linking of biology, chemistry, and physics in 
feedback loops gave us a new understand- 
ing of Earth as a dynamic system. 

Gaia provoked strong reactions, particu- 
larly after Jim’s popular first book on the 
subject in 1979. Evolutionary biologists ar- 
gued either that global regulation required 
consciousness or that natural selection 
could never produce it. In response, Jim invented
“Daisyworld,” a model parable that
demonstrated how feedbacks involving life 
could give rise to automatic climate regu- 
lation at a planetary scale. It also showed 
that when regulation breaks down, it does 
so catastrophically. Daisyworld influenced 
a generation of climate modelers and in- 
formed Jim’s follow-up book, The Ages of 
Gaia, which gave a new view of Earth his- 
tory as a series of distinct regulatory re- 
gimes interspersed by periods of turmoil. 
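
The feedback at the heart of Daisyworld can be sketched in a few lines. Dark daisies warm their surroundings, light daisies cool them, and both grow best near an optimal temperature. The version below is a minimal sketch: the parameter values are those commonly quoted for textbook versions of the model and should be treated as assumptions, and the Euler integration and the small "seed" cover are simplifications chosen here for brevity.

```python
SIGMA = 5.67e-8     # Stefan-Boltzmann constant, W m^-2 K^-4
FLUX = 917.0        # assumed stellar flux at luminosity 1.0, W m^-2
Q = 2.06e9          # assumed local heat-redistribution parameter, K^4
DEATH_RATE = 0.3    # assumed daisy death rate per unit time
ALBEDO_BARE, ALBEDO_WHITE, ALBEDO_BLACK = 0.5, 0.75, 0.25
SEED = 0.01         # minimum cover kept so daisies can always regrow
OPTIMUM_K = 295.65  # assumed optimal growth temperature (22.5 C)


def growth_rate(temp_k):
    """Parabolic growth curve that falls to zero about 17.5 K from the optimum."""
    return max(0.0, 1.0 - 0.003265 * (OPTIMUM_K - temp_k) ** 2)


def planet_state(luminosity, white, black):
    """Return (planetary albedo, fourth power of the planetary temperature)."""
    bare = 1.0 - white - black
    albedo = bare * ALBEDO_BARE + white * ALBEDO_WHITE + black * ALBEDO_BLACK
    return albedo, FLUX * luminosity * (1.0 - albedo) / SIGMA


def planet_temperature(luminosity, with_daisies=True, steps=4000, dt=0.05):
    """Integrate daisy cover with Euler steps; return planetary temperature in K."""
    white = black = SEED if with_daisies else 0.0
    for _ in range(steps if with_daisies else 0):
        albedo, t4 = planet_state(luminosity, white, black)
        bare = 1.0 - white - black
        # Dark patches run warmer and light patches cooler than the planet average.
        temp_white = (Q * (albedo - ALBEDO_WHITE) + t4) ** 0.25
        temp_black = (Q * (albedo - ALBEDO_BLACK) + t4) ** 0.25
        white += dt * white * (bare * growth_rate(temp_white) - DEATH_RATE)
        black += dt * black * (bare * growth_rate(temp_black) - DEATH_RATE)
        white, black = max(white, SEED), max(black, SEED)
    return planet_state(luminosity, white, black)[1] ** 0.25


if __name__ == "__main__":
    for lum in (0.8, 0.9, 1.0, 1.1, 1.2):
        alive = planet_temperature(lum)
        dead = planet_temperature(lum, with_daisies=False)
        print(f"luminosity {lum:.1f}: with daisies {alive:6.1f} K, lifeless {dead:6.1f} K")
```

Comparing the two columns shows the point of the parable: as the star brightens, the lifeless planet simply warms, whereas the planet with daisies changes temperature far less, without any foresight or planning on the part of the daisies.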

In 1992, at age 18, I wrote to Jim to an- 
swer his call for “practitioners of planetary 
medicine.” He had the humility to invite me 
to visit his home and laboratory at Coombe 
Mill, where we walked the grounds at his 
signature breakneck speed. I marveled at 
his laboratory in a converted barn, which 
upstairs housed the homemade dilution 
chamber he used to calibrate his instru- 
ments. Jim became my mentor, unofficial 
supervisor, and close friend. His wicked 
sense of humor was always testing people’s 
limits. On one memorable visit to the lab, 
he opened an old ice cream tub to reveal 
an orange puttylike substance, asking 
me what I thought it was. “Plasticine?” I
ventured. “Semtex!” he replied. Jim was 
regularly employed by the UK Ministry of 
Defence, in this case to improve methods 
of sniffing out explosives. He was proud to 
have worked with explosives like Semtex 
throughout his life and still be in posses- 
sion of his fingers, a tribute to his care in 
the lab, despite a professed loathing for 
“health and safety.” 

Jim had an amazing intuition for how 
things worked, often arriving at a working 
solution or invention without knowing how 
he got there. He was also an incredibly cre- 
ative thinker who could make connections 
that no one else saw. Although wonderfully 
generous to his friends, Jim took a dim view 
of humanity’s collective potential, writing, “I 
would sooner expect a goat to succeed as a 
gardener as expect humans to become stew- 
ards of the Earth.” After sparking the envi- 
ronmental movement, Jim warned of the 
existential risk of climate change in his book 
The Revenge of Gaia. In recognition of his 
services to global environmental science, he 
was made a Companion of Honour in 2003. 

The world has lost a genius and icono- 
clast of immense intellectual courage. 
Never afraid to lambast the establishment 
and challenge convention, Jim Lovelock 
transformed our view of the world, started 
the new field of Earth system science, and 
inspired generations of researchers. As we
are confronted by complexity and volatility, 
from the pandemic to climate extremes, we 
need his unique perspective now more than 
ever before. 
Policy impacts of statistical
uncertainty and privacy 


Funding formula reform may help address unequal impacts 
of uncertainty from data error and privacy protections 


By Ryan Steed, Terrance Liu, Zhiwei Steven
Wu, Alessandro Acquisti


Differential privacy (1) is an increasingly
popular tool for preserving
individuals’ privacy by adding statistical
uncertainty when sharing sensitive
data. Its introduction into US
Census Bureau operations (2), however,
has been controversial. Scholars, politicians,
and activists have raised concerns
about the integrity of census-guided demo- 
cratic processes, from redistricting to vot- 
ing rights. The debate raises important is- 
sues, yet most analyses of trade-offs around 
differential privacy overlook deeper uncer- 
tainties in census data (3). To illustrate, we 
examine how education policies that lever- 
age census data misallocate funding be- 
cause of statistical uncertainty, comparing 
the impacts of quantified data error and of 
a possible differentially private mechanism. 
We find that misallocations due to our dif- 
ferentially private mechanism occur on the 
margin of much larger misallocations due 
to existing data error that particularly
disadvantage marginalized groups. But we
also find that policy reforms can reduce the 
disparate impacts of both data error and 
privacy mechanisms. 

Differential privacy is the cornerstone 
of the Census Bureau’s updated disclosure 
avoidance system (DAS) (2). Designed to 
rigorously prevent reconstruction, reiden- 
tification, and other attacks on personal 
data, differential privacy formally guaran- 
tees that published statistics are not sensi- 
tive to the presence or absence of any in- 
dividual’s data by injecting transparently 
structured statistical uncertainty (noise) (1).
But even before differential privacy is ap- 
plied, estimates from the decennial census, 
surveys such as the American Community 
Survey (ACS), and other Census Bureau 
data products used for critical policy
decisions already contain many kinds of
statistical uncertainty, including sampling,
measurement, and other kinds of nonsampling
error (4). Some amount of those errors is 
quantified, but numerous forms of error 
are not (5), including some nonresponses, 
misreporting, collection errors, and even 
hidden distortions introduced by previous 
disclosure avoidance measures such as data 
swapping (6). If quantified and unquanti- 
fied errors alike are not acknowledged and 
accounted for, policies that rely on census 
data sources may not distribute the impacts 
of uncertainty equally. 

In 2021, the US federal government ap- 
propriated over $16.5 billion in Title I 
funds (including several special grants not
analyzed here) to distribute to over 13,000
local education agencies (LEAs)—typically 
school districts—using a formula that takes 
as input census estimates of the number of 
children and children in poverty. School 
districts qualify for Title I grants on the 
basis of the number or share of children in 
poverty (7). However, the formula does not 
account for deviations in the poverty es- 
timates that could cause misallocations— 
cases where the funding amount allocated 
to a school district differs from its entitle- 
ment in an imaginary (3), noise-free world. 

Researchers have recognized Title I as an 
important case study of policy-relevant pri- 
vacy-utility trade-offs (8), including misallo- 
cation after noise injection for differential 
privacy (9). We extend this work by compar- 
ing the policy impacts of noise injected for 
privacy to the impacts of existing statistical
uncertainty, contextualizing preliminary
error analyses by Census Bureau scientists
(2). Our results empirically investigate 
analytical predictions and proposals from 
previous work on statistical estimation and 
federal funding formulas (10-12). 

We focus specifically on the way Title 
I implicitly concentrates the negative im- 
pacts of statistical uncertainty on mar- 
ginalized groups. Weakening privacy 
protection will do little to help the most 
vulnerable—for these communities, par- 
ticipating in a census survey can be es- 
pecially risky, despite the benefits of vot- 
ing rights protection and school funding. 
Historically, abuse of census data facili- 
tated internment of Japanese Americans 
and other injustices (3). Today, a parent 
with a restrictive lease may not mention 
their children to a census worker because 
they fear being kicked out by their landlord
if their responses are reidentified (13).


SIMULATING NOISE IN TITLE I ALLOCATIONS
Prior work on differential privacy in the 
context of Title I is purely analytical, ana- 
lyzes abstracted components of funding 
formulas, or focuses only on basic grants 
(8, 9). By contrast, we fully replicate the 
Title I provisions for allocating more than 
$11.6 billion in basic, targeted, and concen- 
tration grants using the same data sources 
and procedures as the Department of Edu- 
cation, which is responsible for calculating 
the official Title I grant amounts each year 
(7). We measure the impact of data and 
privacy deviations on the 2021 allocations 
to 13,190 LEAs across the United States. 
The primary data input is the Census 
Small Area Income and Poverty Estimates 
(SAIPE) from 2019—a table of counts of 
total population, children, and children in 
poverty in school districts from all 50 states 
(excluding Puerto Rico and other territo- 
ries) that incorporates weighted survey es- 
timates from the ACS [see supplementary 
materials (SM) section 2 for details]. 

In a given year, the SAIPE may vary due 
to several sources of error, including rela- 
tive error in the county-level estimate, error 
from other data sources used (e.g., tax data), 
and errors from raking and recombination 
methods used to convert county estimates 
to school district estimates (4). To simulate 
the effects of these “data deviations”—quantified
data errors (14)—we generate alternative
poverty estimates for each school district
from a normal distribution around the
published estimate of children in poverty in 
that district from the 2019 SAIPE, following 
prior work and Census Bureau guidance (4) 
(SM section 2). 
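
As a rough illustration of this step (not the authors' code), the sketch below draws one alternative poverty count per district from a normal distribution centered on the published estimate. The function name, the example counts, and the coefficients of variation are hypothetical; the actual error model is described in SM section 2. Truncating draws at zero is likewise an assumption of this sketch.

```python
import random

def draw_data_deviations(published, cv, rng):
    """Return one simulated poverty count per district.

    published: published estimates of children in poverty (hypothetical values)
    cv: assumed coefficient of variation for each estimate (hypothetical values)
    rng: a random.Random instance, so that runs are reproducible
    """
    simulated = []
    for estimate, rel_err in zip(published, cv):
        sd = rel_err * estimate            # standard error implied by the assumed CV
        draw = rng.gauss(estimate, sd)     # normal draw around the published value
        simulated.append(max(draw, 0.0))   # counts cannot be negative (assumption)
    return simulated

rng = random.Random(0)
published = [40, 350, 12000]   # hypothetical districts, small to large
cv = [0.60, 0.25, 0.05]        # assumed: smaller districts have larger relative error
print(draw_data_deviations(published, cv, rng))
```

Repeating the draw many times, and feeding each set of counts through the funding formula, would give a distribution of simulated allocations for each district.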

We then add “privacy deviations”—noise
deliberately injected to achieve differential 
privacy. The Census Bureau has not yet
announced any concrete plans for updated
disclosure avoidance in the ACS, and the 
SAIPE currently does not inject noise for 
privacy on top of its inputs. To illustrate 
how privacy deviations might affect these 
and similar products, and to guide policy- 
makers as the Census Bureau develops 
new disclosure avoidance measures, we 
follow prior work (8, 9) in applying the 
Laplace mechanism, a commonly used 
noise-injection procedure that is provably 
differentially private (1). Our hypothetical
mechanism does not include the complex 
postprocessing applied to the discrete 
Gaussian mechanism used in the decennial 
census; we only round negative numbers 
to zero (2). 
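
A minimal sketch of such a mechanism, assuming a sensitivity of 1 (the article notes that the SAIPE's true sensitivity is unclear) and using only the Python standard library, might look as follows. The function name and the example values are hypothetical. The noise scale is the sensitivity divided by the privacy parameter epsilon, so a smaller epsilon means more noise.

```python
import random

def laplace_mechanism(count, epsilon, sensitivity, rng):
    """Add Laplace noise with scale sensitivity/epsilon; round negatives up to zero."""
    scale = sensitivity / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return max(count + noise, 0.0)

rng = random.Random(0)
for eps in (0.1, 1.0):
    # Hypothetical district with 25 children in poverty; sensitivity 1 is assumed.
    print(eps, laplace_mechanism(25.0, eps, 1.0, rng))
```

Because the noise scale does not depend on the count, the same absolute noise is a much larger relative perturbation for a district with 25 children in poverty than for one with 25,000.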

The strength of differential privacy
(described by the parameter ε) determines the
magnitude of privacy deviations (lower ε
implies stronger privacy and generally more
noise). ε measures how much an individual’s
decision to respond to a census survey
increases their risk of unwanted disclosure.
It is not yet clear whether or how privacy
deviations would be added to a statistical
product like the SAIPE in practice, and
because the SAIPE incorporates weighted
survey estimates from the ACS, its sensitivity
to changes in an individual’s response is
unclear. Instead, we try several reasonable
privacy settings to provide an upper bound
on the magnitude of privacy deviations that
might be added in practice (8) (SM section
2). We focus on ε = 0.1 and ε = 1 (SM section
7 additionally varies ε from 0.001 to 10).
Previous work on Title I (8) suggests ε =
2.52; many applications use similarly high
values (2), whereas differential privacy
advocates often prefer ε < 1.

The Title I legislation includes two
post-formula provisions to achieve secondary
policy goals. The “hold harmless”
provision (20 U.S.C. §6332) limits funding
losses to between 5 and 15% per year and
the “state minimum” provision (20 U.S.C.
§6333) sets a formulaic floor on the total
amount received by each state. We treat
the allocations generated without these
provisions as the official formula-based
“entitlements” for each district. Later, we
compare these entitlements and the real
allocations produced with these provisions.
For each privacy setting, we compute
the misallocation due to deviations by
comparing the simulated allocations after
deviations to the official entitlements. We
repeat this procedure 1000 times, drawing
new data and privacy deviations in each
trial. Our metric of group-weighted
misallocation describes the expected
misallocation borne by the average
formula-eligible child in a given group
nationwide, assuming that misallocation to
a district is borne equally by all its
eligible students.


Expected lost entitlements due to data error and privacy protections
Out of a total of $11.7 billion in Title I basic, concentration, and targeted grants in 2021, we show expected
sum of lost entitlements over 1000 trials due to quantifiable data error alone (“data deviations”; blue),
and with the addition of noise injected for privacy (“privacy deviations”; blue plus red). Noise is injected with
the ε-differentially private Laplace mechanism. The margins of error at 99% confidence are too small to be
depicted—less than $4 million for all three bars. Note that for ε = 1.0, the additional funding loss due to privacy
deviations falls within the 90% margin of error for the impact of data deviations alone.

Loss due to quantifiable data error: $1.06 billion
Additional loss due to noise injected for privacy (less noise, ε = 1.0): $771 thousand
Additional loss due to noise injected for privacy (more noise, ε = 0.1): $50 million
Horizontal axis: 2021 total funding for Title I grants ($ billions)

SUBSTANTIAL MISALLOCATIONS

Of the roughly $11.7 billion distributed 
nation-wide in 2021, districts in our simu- 
lation expect to lose a total of $1.06 billion 
(summing all losses in each simulation, 
then averaging summed losses across 1000 
simulations; SD = $0.04 billion) in entitle- 
ments to other districts due to the Title I 
formula’s handling of existing (before dif- 
ferential privacy) data deviations alone (see 
the first figure). The standard deviation in 
misallocation (computed by averaging over 
1000 trials) is about $835,000 (the average 
district receives around $880,000)—$237 
per student. When we add privacy devia- 
tions (for a relatively strong privacy setting 
ε = 0.1), the expected total entitlement loss
only increases by $50 million (4.7%; mar- 
ginal SD = $2.9 million). For a less strong 
privacy setting (smaller privacy deviations; 
ε = 1), the increase is negligible. The marginal
impact is small because—as in the
2020 Decennial Census (2)—the magni- 
tude of privacy deviations is comparable to 
the magnitude of data deviations only in 
the least populous districts, even at a
relatively strong privacy setting (ε = 0.1) (SM
section 7). 
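
The headline figure is computed by summing all losses within each simulation and then averaging those sums across simulations. A small sketch of that calculation, with made-up dollar amounts and a hypothetical function name, could look like this:

```python
from statistics import mean, stdev

def expected_total_loss(entitlements, trials):
    """Mean and SD, across trials, of the summed shortfalls relative to entitlement."""
    totals = []
    for allocations in trials:
        # Only districts that receive less than their entitlement count as losses.
        shortfall = sum(max(e - a, 0.0) for e, a in zip(entitlements, allocations))
        totals.append(shortfall)
    return mean(totals), stdev(totals)

entitlements = [100.0, 200.0, 700.0]   # hypothetical noise-free entitlements
trials = [
    [120.0, 190.0, 690.0],   # trial 1: districts 2 and 3 each lose 10
    [90.0, 230.0, 680.0],    # trial 2: districts 1 and 3 lose 10 and 20
]
avg, sd = expected_total_loss(entitlements, trials)
print(avg, sd)
```

Each hypothetical trial distributes the same fixed total, so every dollar lost by one district is a dollar gained by another.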

These costs are geographically asym- 
metrical. Certain population-sparse school 
districts, especially in the Northwest, ben- 
efit greatly on average from data deviations 
(fig. S3a)—their small sample sizes induce 
proportionally larger data deviations, and, 
because of their low absolute numbers of 
children in poverty (though poverty rates 
may still be high), they have more room to 
gain funding than to lose funding. Then, be- 
cause the federal appropriation is fixed and 
allocations are zero sum, more populous 
districts, especially in the Southeast, pay for 
that proportional increase in funding with a 
small “tax” (9). Less populous districts gain 
even more as they qualify for new grants 
(10) (fig. S4). Notably, although less popu- 
lous, usually rural districts gain funding on 
average from data deviations, their alloca- 
tions are more volatile (9) (fig. S7). 

When we add privacy deviations (for 
relatively strong privacy, ε = 0.1), gains by
small districts are even more exaggerated 
(fig. S3b). Unlike data deviations, where 
the absolute variance increases with popu- 
lation size, our privacy deviations have the 
same variance in every district, exceeding 
data deviations in magnitude only in the 
least populous districts. Still, the marginal 
increase in cost to districts due to privacy 
deviations is much less than the base-level 
misallocations resulting from data devia- 
tions, and the marginal change reduces to- 
tal misallocation about half the time. 


DIVERSION FROM MARGINALIZED GROUPS 
Owing to Title I’s distribution of quantified 
data deviations alone, Black students and 
Asian students can expect to lose around 
$5 and $8 per eligible student, respectively, 
whereas white students gain over $2 per 
eligible child on average (see the second fig- 
ure). (The average district receives $1120 per 
eligible student.) Likewise, school districts 
with large Cuban, Puerto Rican, and other 
Hispanic communities expect to lose fund- 
ing (between $3 and $14 per eligible stu- 
dent), whereas non-Hispanic districts gain 
(fig. S9). For a child in a particular district 
in an unlucky year, the disparity may be 
worse. Whether a demographic group loses 
funding depends on whether its members 
tend to live in high- or low-poverty districts.
Often, this happens because the poverty
rate in the group itself is high. Groups
that tend to live in denser, usually urban 
districts with more children in poverty lose 
out, whereas groups that live in sparse, often
rural districts with fewer children in poverty 
(though the rate of poverty may be higher) 
gain. Geographically concentrated groups— 
such as tribal nations or racial subgroups 
(SM section 4)—experience more volatility in 
outcomes across trials, which depend on the 
population density and poverty rates where 
they live. 
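
The per-student figures above follow the group-weighted misallocation metric defined in the caption of the second figure: each district's misallocation is weighted by the group's share of that district, summed nationwide, and divided by the group's nationwide number of eligible children. The sketch below is one reading of that definition; the function name and all numbers are hypothetical, and it assumes a group's eligible children in a district equal the district's eligible children times the group's share.

```python
def group_weighted_misallocation(misallocation, eligible, group_share):
    """Expected misallocation per eligible child of one group, nationwide.

    misallocation: dollars gained (+) or lost (-) by each district
    eligible: number of formula-eligible children in each district
    group_share: proportion of each district belonging to the group
    """
    numerator = sum(m * s for m, s in zip(misallocation, group_share))
    denominator = sum(n * s for n, s in zip(eligible, group_share))
    return numerator / denominator

# Two hypothetical districts: a small one that gains and a large one that pays.
misallocation = [5000.0, -5000.0]
eligible = [100, 1000]
share_a = [0.9, 0.2]   # group A is concentrated in the small district
share_b = [0.1, 0.8]   # group B is concentrated in the large district
print(group_weighted_misallocation(misallocation, eligible, share_a))
print(group_weighted_misallocation(misallocation, eligible, share_b))
```

In this toy setting the group concentrated in the larger district ends up with a negative per-child figure even though total funding is unchanged.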

In a relatively strong privacy setting
(ε = 0.1), our differential privacy mechanism
aggravates these disparities, especially for 
Black students, who lose more than twice 
as much funding on average after noise 
is injected—possibly because Black stu- 
dents are more likely to attend populous 
school districts where the costs of privacy 
deviations accumulate. But in less strong 
privacy settings (ε = 1), disparities change
very little from the status quo when privacy
deviations are added (SM section 7).
To assess the impacts on noncategori- 
cal demographics, we also fit a general- 
ized additive model (GAM) to the school 
district-level combined misallocations
(ε = 0.1) using district population
characteristics: population density, median
household income, proportion white,
proportion Hispanic, proportion renter-occupied
housing, and racial homogeneity (the 
Herfindahl-Hirschman index). Fitting the 
GAM on a sample of 100 trials, we find 
that districts with a median income be- 
tween approximately $25,000 and $75,000 
(about 56% of districts) can expect to lose 
out because of deviations, whereas most 
other districts gain (fig. S6). The 40% most
population-dense districts can also expect
to lose funding. Conversely, districts that
are less than 5% Hispanic tend to benefit
from data and privacy deviations.


Expected misallocation by racial group
Expected misallocation borne by the average formula-eligible child in a given census group nationwide is shown (assuming
each child in a district is affected by misallocation equally). Specifically, bars depict the nationwide sum of each district's
misallocation multiplied by the proportion of respondents of a given census single race category in that district, divided
by the total nationwide number of eligible children of that race (SM section 2). Averaged over 1000 trials. The colored bars
indicate the race-weighted misallocation due to data deviations (data error) alone, with an error bar spanning a 90% normal
confidence interval for this quantity. The additional impact of privacy deviations is significantly different (P < 0.01) for all
groups, according to a two-sample z-test.


SIMPLE REFORMS 

Simple changes to the formula—including 
additional provisions currently required 
by law—can alleviate or aggravate dispari- 
ties. For example, adding the hold harmless 
provision reduces the standard deviation in 
misallocation (relative to the formula entitle- 
ment) but drastically increases disparities in 
outcomes for racial minorities (see the sec- 
ond figure). Hold harmless prevents small 
districts from losing funding to data or pri- 
vacy deviations, thereby increasing the tax on 
more populous districts and their non-white 
residents. The state minimum provision has 
a similar but smaller effect. Typically received 
by low population states, the 
state minimum slightly increases 
the amount of grants to low pop- 
ulation districts, exacerbating 
disparities. 
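
The mechanics of the hold harmless effect can be sketched under strong simplifying assumptions: a single floor rate of 85% of the prior-year allocation (the statute allows losses of between 5 and 15%), a fixed budget, and a proportional rescaling of all districts that are not held at their floor. The rescaling rule, the function name, and the numbers are assumptions of this sketch, not the statutory procedure.

```python
def apply_hold_harmless(formula, previous, rate=0.85):
    """Raise districts to rate * previous allocation and rescale the rest.

    formula: formula-based allocations for this year
    previous: last year's allocations
    The total budget, sum(formula), is preserved whenever the floors are affordable.
    """
    budget = sum(formula)
    floors = [rate * p for p in previous]
    held = set()    # indices of districts pinned at their floor
    scale = 1.0
    while len(held) < len(formula):
        free_budget = budget - sum(floors[i] for i in held)
        free_total = sum(f for i, f in enumerate(formula) if i not in held)
        if free_total == 0:
            break
        scale = free_budget / free_total
        # After rescaling, more districts may fall below their floor, so repeat.
        newly_held = {i for i, f in enumerate(formula)
                      if i not in held and f * scale < floors[i]}
        if not newly_held:
            break
        held |= newly_held
    return [floors[i] if i in held else f * scale for i, f in enumerate(formula)]

# Hypothetical: a small district whose estimate fell sharply, and a large one.
print(apply_hold_harmless([50.0, 950.0], [100.0, 900.0]))
```

In this example the small district is lifted from 50 to its floor of 85, and the 35 needed to do so comes out of the larger district's allocation.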

This result illustrates a ten- 
sion in evidence-based formula 
funding: Because estimates for 
less populous geographies have 
higher variance in both privacy 
and data deviations relative 
to their populations and en- 
titlements, measures that over- 
whelmingly benefit those small 
areas burden larger areas. 

We tested proposed policy 
changes that could alleviate this 
tension (SM section 6). We find 
that using multiyear averages 
with windows of increasing size 
decreases both overall misallo- 
cation and outcome disparities 
compared to when we use the 
averaged poverty estimates as a 
baseline (figs. S14 and S15). In 
general, using an average dimin- 
ishes both data deviations and 
the privacy deviations required 
to achieve differential privacy, 
limiting both increases in ex- 
pected funding for less popu- 
lous districts and alleviating 
worst-case outcomes. Averaging 
may even be just as effective at 
stabilizing funding year to year 
as the hold harmless provision 
(10). We also tested requiring 
repeated years of ineligibility 
before disqualifying districts 
from funding, which did not 
change overall misallocation— 
likely because it permits more 
marginally wealthy districts to 
receive funding—but did reduce
disparities (figs. S14 and S15). 
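
Why averaging helps can be seen in a toy simulation. It assumes a district whose true poverty count is stable over the window and whose yearly estimation errors are independent and normally distributed; both are simplifications, and the function name and numbers are hypothetical. Under those assumptions the spread of a k-year average shrinks roughly with the square root of k.

```python
import random
from statistics import mean, pstdev

def averaged_estimate(true_count, sd, years, rng):
    """Average of `years` independent noisy estimates of the same true count."""
    return mean(rng.gauss(true_count, sd) for _ in range(years))

rng = random.Random(7)
for years in (1, 3, 5):
    # Hypothetical district: 500 children in poverty, yearly standard error of 100.
    draws = [averaged_estimate(500.0, 100.0, years, rng) for _ in range(5000)]
    print(years, round(pstdev(draws), 1))
```

The same reasoning applies to noise injected for privacy, provided fresh noise is drawn for each year's estimate.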

PAYING FOR (PRIVATE) DATA

Simple policy changes can alleviate dis- 
parities in the impact of statistical un- 
certainty, but precisely targeted funding 
formulas will still have costs. Policy-mak- 
ers could ensure that no school district 
expects to lose money because of the un- 
derlying data deviations quantified in our 
simulation by assigning just $107 million 
(SD = $31 million) in targeted payments 
to individual districts that lose funding 
on average across 1000 simulations. The 
cost of stronger privacy (using our simpli- 
fied mechanism) could be much less: To 
compensate districts for only the expected 
additional lost funding due to privacy devi- 
ations, policy-makers need only distribute 
an extra $41 million (SD = $3.8 million) for 
stronger privacy (ε = 0.1), or $1.7 million
(SD = $601,000) for less added privacy
(ε = 1) (SM section 7). Still, a district’s actual
loss in any given year often greatly exceeds 
its expected loss, especially for less popu- 
lous districts. To compensate districts for 
both data and privacy deviations in all but 
the worst 5% of our simulations, an ad- 
ditional $4.7 billion would be needed in 
the stronger privacy setting (ε = 0.1). The
cost is greater if policy-makers wish to also 
compensate for the many other forms of 
error not quantified here, or for a stronger 
privacy mechanism. 
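
One way to read the targeted-payment figure is as the sum, over districts that lose money on average, of each district's average net loss across simulations. The sketch below implements that reading with made-up numbers and a hypothetical function name; it does not attempt the 95% coverage calculation, whose exact construction is not spelled out here.

```python
from statistics import mean

def targeted_compensation(entitlements, trials):
    """Total payment needed so that no district loses money in expectation."""
    total = 0.0
    for d, entitlement in enumerate(entitlements):
        # Average net change for district d across all simulated allocations.
        avg_change = mean(trial[d] - entitlement for trial in trials)
        if avg_change < 0:
            total += -avg_change   # pay back only the expected net loss
    return total

entitlements = [100.0, 200.0, 700.0]   # hypothetical noise-free entitlements
trials = [
    [120.0, 190.0, 690.0],
    [90.0, 230.0, 680.0],
]
print(targeted_compensation(entitlements, trials))
```

Because gains and losses within a district offset each other across trials, this quantity is much smaller than the total of all losses summed within each trial, which is why a district's loss in any single year can still far exceed its expected loss.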

It may be difficult to justify or legislate 
funding increases to just the districts ex- 
pected to lose funding. Simply increasing 
the total federal appropriation to Title I 
(benefiting all districts unequally) by $135 
million (the combined total expected loss) 
would only compensate for about half of 
expected losses. However, a $4.7 billion in- 
crease (95% loss coverage) would compen- 
sate for nearly all total expected losses and 
cut total 5% quantile misallocation roughly 
in half. The White House’s proposed 2022 
allocation—a $20 billion increase, since 
reduced to $1 billion in Congress—would 
completely compensate for privacy and 
data deviations incurred under the 2019 
budget, but inequalities would remain. 
An overall budget increase would provide 
“no-penalty” compensation (9) for data 
and privacy deviations but would not solve 
issues of relative equity (though budget 
increases do reduce the number of held 
harmless districts). 


UNCERTAINTY-AWARE POLICY DESIGN 

The addition of noise for differential pri- 
vacy exposes epistemic issues with for- 
mula design predicted by early work on 
census-guided federal funding even before 
differential privacy was first proposed (3, 
10-12). Indeed, our results suggest that the 
impacts of differential privacy relative to 
other sources of error in census data could
be minimal. But current legislation holds 
few allowances for the impacts of statisti- 
cal uncertainty. Use of census data for the 
Title I formula is mandated “unless the 
Secretary and the Secretary of Commerce 
determine that some or all of those data 
are unreliable or ... otherwise inappropri- 
ate” (20 U.S.C. §6333). National Research 
Council studies, commissioned by the De- 
partment of Education before ACS esti- 
mates were first incorporated in the SAIPE 
after 2005, warned against hard thresholds 
and hold harmless provisions (11, 12)—but
these provisions are still in effect. Recently, 
the Biden administration proposed a new 
Title I budget that includes funding to im- 
prove the poverty estimates—but there are 
still no measures to update the formula to
handle uncertain inputs. Simply acknowledging
the effects of data error could improve
future policy design for both formula
funding and disclosure avoidance. 

Our findings come with limitations. 
Injected noise is just the tip of the iceberg: 
Many other unquantified forms of statisti- 
cal uncertainty—including previous dis- 
closure avoidance methods—affect poverty 
estimates in different ways (5). No confi- 
dentiality measures are directly applied to 
the SAIPE, but its inputs (mainly ACS and 
IRS data) may have hidden or unintended 
distortions due to swapping and other ad 
hoc disclosure avoidance techniques (6). 
By replacing other methods of disclosure 
avoidance, differential privacy could even 
reduce the amount of overall misallocation 
due to uncertainty. Lacking an alternative 
source of poverty data, we do not assess 
the impacts of systematic biases, including 
undercounts of marginalized groups. Our 
analysis of the Title I allocation process also 
leaves out several elements that could af- 
fect the applicability of our findings to the 
real-world distribution of funds, including 
small-district appeals (20 U.S.C. §6333) and 
district-level heterogeneity in use of funds. 
Temporal trends in funding, in combination 
with provisions like hold harmless, could 
compound the effects of deviations (10).

Data error—from undercounts to sam- 
pling error to noise injection—will always 
affect evidence-based policy to some de- 
gree. In 2017, 316 federal spending pro- 
grams relied on US census data to distrib- 
ute over $1.5 trillion in federal funding 
across states, cities, and school districts
(15). Uncertainty in census data—including
intentionally added error for privacy—will 
incur costs for stakeholders in those pro- 
grams. But at least the quantifiable por- 
tion of those costs can be mitigated with 
uncertainty-aware policy design and bud- 
get increases—an avenue for compromise 
between targeted policy, equity, and also 
additional privacy. 

The new storytellers


Computer-crafted tales offer insight into human creativity 


By Stephanie Dick 


Storytelling is an essential feature of
the human condition, propose Mike 
Sharples and Rafael Pérez y Pérez in 
Story Machines. It allows us to make 
meaning in the world and in our lives, 
to communicate with one another, to 
teach, to learn, and to explore. The book, 
which provides a readable, engaging, and 
instructive introduction to the mechanisms 
according to which computers have been 
made to produce “stories,” also confronts 
more fundamental questions: What consti- 
tutes our own creativity? Can stories do their 
cultural work without connecting a reader to 
a human writer? If computers cannot under- 
stand, appreciate, or intend the meaning of 
their compositions, does that limit their cre- 
ative potential and the work they might do? 
The landscape of artificial intelligence 
(AI) literature so often tends toward over- 
confidence, even hubris, but Sharples and 
Pérez y Pérez, refreshingly, do not claim 
to answer these questions in the book, al- 
though they remain at the heart of the proj- 
ect, nor do the pair pronounce what the 
future will or should be for computerized 
writing. The authors are as interested in
what can be learned about people as story- 
tellers and meaning-makers as they are in 
the automation of literary composition. 

The book begins with an engaging sur- 
vey of historical, cognitive, and cultural 
perspectives on human storytelling and a 
discussion of the long tradition of attempts 
to automate the production of 
compelling stories. Readers are 
then taken through a detailed ex- 
ploration of the main approaches 
to AI-driven story production,
which doubles as an accessible 
introduction to different ap- 
proaches to AI in general. 

We meet researchers who have 
sought to identify a set of rules, or 
grammars, informed by both lin- 
guistics and the formal study of 
myths, fairy tales, and novels, that 
could be translated into computer programs. 
As with most other domains of AI research 
that seek to convincingly simulate complex 
human behavior according to a set of rules 
in this fashion, the stories produced by rule- 
bound generators tend to be disappointing. 

The authors then introduce more-advanced
strategies, such as neural networks, which
are trained on massive datasets (in this
case, of existing stories and prose). Such
approaches—which empower algorithms to
generate their own protocols on the basis of
patterns and correlations they identify in
the data, allowing for predictions about how
to construct effective sentences, plots,
characters, and structures—can produce far
more compelling stories, but they often come
at the cost of our access and understanding.


Story Machines
Mike Sharples and
Rafael Pérez y Pérez
Routledge, 2022.
194 pp.


A visitor observes a robot writing a Bible in
Lutherstadt Wittenberg, Germany, in 2017.

For Sharples and Pérez y Pérez, the goal 
of automation is not just to find any means 
to produce compelling computer story gen- 
eration; they seek automated tools that can 
report on themselves, thereby facilitating 
a refined understanding of human creativ- 
ity. This, they believe, could allow for new 
forms of storytelling, creative collaboration, 
and readership. The authors praise meth- 
ods of story automation that are based on 
both active models of human creativity and 
the construction of what they call “story 
worlds,” which consist of places, characters, 
and constraints that a computer might ex- 
plore and tell stories within and about. 

One of the most valuable features of the 
book is its rich presentation of examples. 
Readers come away having read nearly 100 
instances of mechanically and computa- 
tionally generated stories, which provide a 
clear sense of the variety of approaches and 
the kind of story they produce. Readers are 
even invited at several moments to experi- 
ment with these methods themselves. The 
authors also integrate computer-generated 
prose into the book’s text, which serves as 
a frequent reminder that the act of reading 
may shift and transform as automation and 
authorship converge in different ways. 

The book largely frames auto- 
mated storytelling as a benign, 
creative, intellectual, and excit- 
ing field. However, it also hints 
at darker and deeply concerning 
possibilities. The authors men- 
tion believable computer genera- 
tion of fake news stories, targeted 
custom narrative production 
that could infiltrate our internet 
browsing, fabricated texts made 
to convincingly read like those 
written by specific people, and 
many other potentially exploitative and ex- 
tractive applications. Indeed, the most pow- 
erful tools for automated story production 
described in the book have been developed 
by entities such as Microsoft’s OpenAI and 
Facebook, which have a reputation for pri- 
oritizing power, control, and profit over all 
else. The main drawback of the book is that 
it attends only in passing, and almost
hesitantly, to these realities.

10.1126/science.adc9237 

When utility providers fall short


Pacific Gas and Electric’s abysmal track record highlights 
the company’s failings and our own 


By Gretchen Bakke 


Pacific Gas and Electric (PG&E) is
no one’s favorite utility provider.
Failures of the company’s electrical
equipment have been implicated in
fire after fire in Northern California.
Its gas network, meanwhile, caused
an explosion that destroyed a full block of 
a suburban San Francisco neighborhood in 
2010. PG&E is perhaps best known, how- 
ever, for its role in the fire that wiped the 
town of Paradise, California, from the map 
in 2018. In addition to causing tremendous 
property damage, that fire killed 85 peo- 
ple, the vast majority of whom were found 
dead in their homes or in cars, trapped by 
the fast-moving, wind-whipped flames that 
engulfed the valley. 

Katherine Blunt starts her copiously 
researched account of PG&E’s checkered 
history here, at the moment a century-old 
hook on a remote transmission line failed, 
dropping the line to the ground and ig- 
niting what would become known as the 
Camp Fire. The deadliest fire in Califor- 
nia history, the Camp Fire was neither the 
first nor the last to be sparked by PG&E, a 
utility company with such a poor track re- 
cord that only a careful excavation such as 
Blunt’s can chart the thin course between 
wrongdoing, simple incompetence, poor 
governance, and bad luck. 

PG&E—like every other utility provider— 
has never existed in a vacuum; it has been 
swayed by American cultural values from 
the start. Blunt introduces readers to these 
values via a long list of mustachioed and 
variously coiffed characters across the
centuries—men whose visions for the com- 
pany (or against it) pushed and pulled 
PG&E through California as it grew from 
a gold-rush frontier to the drought-riven, 
climate-concerned global economic actor 
it is today. It was a rollicking century to 
be sure, one about which PG&E has man- 
aged to lose a truly momentous number of 
technical records while becoming mired in 
even more scandals than Blunt details (1).

PG&E’s service territory covers about 
70,000 square miles, home to some of the 
richest people in America and to some of
the country’s wildest remaining lands—the 
Sierras, the Pacific coast, grasslands, and 
(today) thousands of acres of bone-dry 
forests. By one estimation, to effectively 
control wildfires, PG&E would not only 
need to upgrade all its rural electrical in- 
frastructure but also regularly trim more 
than 8 million trees “in strike 
distance” of its lines. This work 
is both expensive and thankless; 
trees never stop growing, and a 
lot of the company’s infrastruc- 
ture is very old. 

Should such a company— 
especially one whose equipment 
sparks as a part of its normal 
operations, because all electri- 
cal systems do spark and short 
and fail occasionally—even ex- 
ist in the tinderbox that is Northern Cali- 
fornia today? This question stands at the 
heart of Blunt’s book and plagues PG&E’s 
current leadership. And it is masked by 
another: Could any company provide such 
services while also meeting expectations 
for uninterrupted energy supply, deliver- 
ing promised returns to investors, and re- 
paying fire and gas explosion victims, all 
while using renewables such as wind and 
solar to produce more than 30% of its elec- 
tricity? Seemingly not, except that both of 
California’s other investor-owned utilities,
Southern California Edison and San Diego
Gas and Electric, have a much better, if still
imperfect, track record.

California Burning
Katherine Blunt
Portfolio, 2022. 368 pp.

Blunt argues that PG&E is a failed com- 
pany that nevertheless remains a major 
energy provider because no one—not the 
courts, not regulators, not legislators, not 
consultants, not the company’s serially re- 
placed CEOs, and not fire victims or their 
lawyers—has come up with a better way 
to provide reliable electricity and gas to 
4.8 million residential customers. As PG&E 
roils from bankruptcy proceeding to bank- 
ruptcy proceeding, as it is convicted (twice) 
of willfully disregarding safety regulations 
that have led to the deaths of a hundred 
people and destroyed the homes and live- 
lihoods of thousands of others, 
Blunt asks an even bigger ques- 
tion: If a corporation is a person 
in the eyes of the law, what does 
the death penalty look like for 
such an entity? 

Blunt’s story is not only a tale 
of the mostly undeserved sec- 
ond chances granted to PG&E 
but also of the failings of Cali- 
fornia’s politicians, regulators, 
and judiciary to mandate a new 
route forward that prioritizes the safe and 
reliable provision of energy. There are bet- 
ter solutions available than the ones PG&E 
is offering. That we continue turning to 
PG&E again and again to provide what the 
company has proven it cannot is a failing 
that falls on us.

A vendor sells crabs in an open-air market in Washington, DC.


PODCAST 


The Blue Revolution: Hunting, 
Harvesting, and Farming Seafood 
in the Information Age 
Nicholas P. Sullivan 

Island Press, 2022. 272 pp. 


Large-scale commercial fishing has 
depleted fish stocks around the world, 
but new technologies are increas- 
ingly being deployed to improve the 
efficiency and sustainability of this 
enterprise. This week on the Science 
podcast, Nicholas Sullivan unpacks 
the current state of fishing, explores 
the promise of aquaculture and 
fish farming, and discusses the future 
of seafood. 


https://bit.ly/3Q8lcBq

Workers sort plastic waste at Yongin Recycling Center in Yongin, South Korea, during the COVID-19 pandemic.


Edited by Jennifer Sills 


Paths to sustainable 
plastic waste recycling 


The outbreak of COVID-19 has driven 
increased use of medical personal protec- 
tive equipment, packaged take-out meals, 
and home-delivered groceries, exacerbating 
the accumulation of waste plastics (1, 2).
The adoption of inappropriate management
strategies such as local burning, incineration,
and landfilling has also increased,
leading to the leakage of waste plastics
into the environment and hindering the
mitigation of micro- and nanoplastics (3).
Approximately 6% of the world’s annual oil 
production is devoted to plastics, and 850 
million metric tons of greenhouse gases was 
associated with new plastics production and 
incineration of waste plastics in 2019, which 
is equivalent to the annual emissions from 
189 500-megawatt coal-fired power plants 
(4, 5). In the face of these challenges, it is 
imperative to develop strategies to recycle 
waste plastics more sustainably. 

Some national and regional govern- 
ments are taking action toward this goal. 
The European Commission has marked 
plastic recycling as a key priority for the 
new Circular Economy Action Plan and is 
planning to introduce a range of quotas for 
minimum recycled content in new plastic 
products (6). Some states in the United 
States have proposed minimum recycled 
content mandates to end plastic pollution. 
For example, California has required ther- 
moform plastic containers to contain no 
less than 20% or 30% (depending on the
recycling rate) postconsumer recycled plastic
by 2030 (7). These efforts will help mitigate 
plastic pollution. 

Coordinated global actions are also 
needed. In June, a United Nations 
Environment Assembly working group 
began planning for the development of an 
international legally binding instrument, 
which 175 nations committed to completing 
by the end of 2024, to shift away from single- 
use plastics and ensure the achievement of 
the UN Sustainable Development Goals (8). 
The treaty should include recycling stan- 
dards of practice and require commitments 
to implement them. 

Joint development efforts should work 
toward technologies that can recycle waste 
plastics [e.g., (9, 10)]. Designing catalytic 
processes to recover plastic monomers or 
valuable alkane products from waste plastic 
could potentially close the plastic use loop 
and bring a new and profitable branch 
to the plastic recycling industry (11, 12).
Collective efforts to recycle plastic waste 
could substantially contribute to achiev- 
ing an economy with net-zero greenhouse 
gas emissions by 2050 and limiting global 
warming to less than 1.5°C by 2100.

News story miscasts
Alzheimer’s science 


As experienced researchers in the 
Alzheimer’s disease field, we have serious 
concerns about the News story “Blots on
a field?” (C. Piller, 22 July, https://scim.ag/3TlevDh).
The article summarizes
the apparent manipulation of data in a
2006 study (1) and suggests that coauthor
Sylvain Lesné committed egregious
scientific misconduct when preparing
figures supporting a central role for the
specific amyloid-β (Aβ) peptide, Aβ*56,
in memory loss in mouse models of
Alzheimer’s disease. Such investigative
reporting is important to correct potential
fraud. However, the article overstates and
distorts the effect of this paper on the
Alzheimer’s field.

Lesné’s alleged malfeasance does 
not threaten the “reigning theory” of 
Alzheimer’s pathobiology, known as 
the “amyloid hypothesis.” The tone and 
content of the News story and the sensa- 
tionalized media coverage it generated 
could derail Alzheimer’s drug discovery 
and development based on the amyloid 
hypothesis, which is supported by a vast 
body of well-conducted research. 

By 2008, 2 years after the Lesné paper, 
Alzheimer’s researchers determined that 
the findings supporting Aβ*56’s role in
cognitive decline could not be replicated. 
Several, including one of us (D.S.), tried 
(2, 3). Although unaware of malfeasance, 
the field policed itself by not confirming 
these findings. Rather than being a “key 
element” of today’s theory supporting the 
association between Aβ accumulation and
cognitive decline, Lesné’s “star suspect”
rapidly became history as the science of
Aβ oligomers moved ahead. Simply put,
the 2006 Lesné et al. study had minimal, 
if any, impact on the evolution of the amy- 
loid hypothesis. 

Today, approximately 16% of Alzheimer’s 
drugs in development are predicated on 
the amyloid hypothesis. Only a few of these 
directly target Aβ oligomers, and none
targets Aβ*56 specifically. No patients have
been misled and entered trials based on the
Aβ*56 observation. Targeting Aβ-associated
neuronal dysfunction represents one of the 
field’s most promising approaches to prevent 
and treat Alzheimer’s disease.

Alzheimer’s target still
viable but untested 


In his News story “Blots on a field?” (22 
July, https://scim.ag/3TlevDh), C. Piller 
inaccurately describes my scientific the- 
sis. He implied that my work has mis- 
led researchers in the Alzheimer’s disease 
field by encouraging the development
of therapies targeting amyloid plaques,
which are composed of amyloid beta protein
(Aβ). Rather, for more than 20 years,
I have consistently expressed concerns 
that drugs targeting plaques were not 
likely to be effective. 

My published work independent of 
Lesné (1, 2) indicates that there are
two classes of toxic Aβ oligomers based
on quaternary protein structure: type 1
and type 2. One particular form of type
1 (referred to in our papers as Aβ*56) is
found both within and outside amyloid 
plaques and was shown by my lab and 
several others to impair memory function 
in mice (3-5) and to be associated with 
memory loss in humans with probable 
Alzheimer’s disease (6). The type 2 form 
is found encased within amyloid plaques, 
and thus entrapped it has a limited effect 
on cognition. It is this form (type 2) that 
drug developers have repeatedly but 
unsuccessfully targeted. I am unaware of 
any clinical trials targeting the type 1 form 
of Aβ—the form that my research has suggested
is more relevant to dementia. Piller
seems to conflate these two forms of toxic 
oligomers. 

Piller’s story suggests that the pursuit 
of Aβ-targeted therapies for Alzheimer’s
disease—which I agree has been frus- 
tratingly negative and expensive—was 
somehow ignited or fueled by the 2006 
Nature paper (7). This is not true. Decades 
of human genetics research and studies 
of mouse models from many labs had 
already led drug developers to conclude 
that Aβ was a plausible target.

In the story, Piller confounds two distinct 
issues: the frustrations regarding the dif- 
ficulties of drug development in Alzheimer’s 
disease and a specific allegation of scientific 
misconduct relating to a set of papers about 
one particular subset of type 1 Aβ oligomers.
To imply that the latter bears a heavy
burden of responsibility for the former, as 
Piller did, condemns potentially fruitful 
avenues of the Aβ hypothesis.

Having worked for decades to under- 
stand the cause of Alzheimer’s disease so 
that better treatments can be found for 
patients, it is devastating to discover that 
a co-worker may have misled dozens of 
coauthors, including me, and the scientific
community through the doctoring of
images. However, Piller’s implication that
this development invalidates my work is
untrue and unfair.

Editor's note


We thank Selkoe and Cummings for their 
supportive words about our investigative 
reporting, but we disagree with their conten- 
tion that the suspect work by Sylvain Lesné 
had minimal impact. The 2006 Nature paper 
discussed in the Letter—only one of nearly 
two dozen papers now in doubt—was cited 
at least 90 times annually as recently as 
2021, according to Springer Nature. Over
the past 5 years, Cummings and Selkoe
continued to cite it and other work by Lesné
on the amyloid-β (Aβ) Aβ*56 molecule. The
story’s assertion that the suspect work may
have misdirected other Alzheimer’s research
was not ours alone; multiple experts in
the field told us that it jump-started the
search for soluble forms of amyloid-β that
might explain neuron death and cognitive 
decline in Alzheimer’s. Those efforts have 
not yielded any effective therapies to date. 
As for the broader amyloid hypothesis, to 
which both authors have devoted decades of 
effort, it does not depend on Lesné’s work, 
and our story quotes Selkoe to that effect. 
But the doubts about work that once seemed 
a milestone in the effort to link amyloid to 
cognitive decline in Alzheimer’s are surely a 
setback for the field. 

Ashe misstates what we said about the 
influence of her work with Lesné. The idea 
that Alzheimer’s might be treated by target- 
ing amyloid deposits predated their 2006 
paper by many years, as the story explains. 
But that and multiple other suspect papers 
did fuel the search for specific toxic forms 
of amyloid and for drugs that could target 
them, the line of research Ashe continues 
to pursue. She acknowledges the concerns 
about image manipulation that our story 
raises but downplays their importance and 
assumes no responsibility as Lesné’s mentor 
and senior author of dubious studies. 


Tim Appenzeller 
News Editor

MATERIALS SCIENCE
An irregular plan 


Edited by 
Michael Funk 


Materials with irregular microstructures are common in the natural world and often have
interesting properties. Liu et al. devised a growth-inspired program for generating 
irregular materials from a limited number of basic elements. Using building blocks with 
arbitrary complexity, the authors stochastically connected them subject to a set of local 
rules. The results echoed the diversity of natural systems with a large range of func- 
tional properties. —BG Science, abn1459, this issue p. 975 


Artist's conception of an illuminated, three-dimensional building block adding to a growing microstructure

Carbon-based filter
capacitors 


Filter capacitors are used
in filter circuits to convert
alternating current into direct
current by smoothing out
ripples in the incoming supply.
They have been dominated by
aluminum electrolytic capacitors
and are typically the largest
component in an electronic 
circuit because of their low areal 
capacitances, thus limiting the 
potential for miniaturization. 
Using three-dimensional porous 
anodic aluminum oxide (AAO) 
templates, Han et al. constructed
a network of carbon
tubes in which they deposited
nickel catalyst nanoparticles 
and grew vertically aligned car- 
bon nanotubes using chemical 
vapor deposition. After removal 
of the AAO, a flexible film was 
obtained. The films show a 25% 
improvement in areal capaci- 
tance at 120 hertz and can be 
connected in series without
affecting their electrochemical
performance. —MSL 
Science, abh4380, this issue, p. 1004 


AUTOIMMUNITY 
Livers beware T and 
B cells! 


Diets high in fat and sugar are 
known to contribute to chronic 
inflammation in the liver and 
subsequent autoimmune reac- 
tions, yet the immune responses 
behind this phenomenon are not 
well characterized. Studying mice 
given a high-fat, high-glucose diet, 
Clement et al. identified PDIA3, a 
protein involved in immunogenic 
cell death, as a peptide presented 
by cell surface proteins that 
led to pathogenic T and B cell 
responses. PDIA3-specific T cells 
and PDIA3-specific antibodies 
were sufficient to induce liver 
toxicity in mice. Anti-PDIA3 was 
also detected at high levels in the 
serum of patients with chronic 
inflammatory liver conditions, 
suggesting that this mechanism 
may also be operative in humans. 
—DAE 

Sci. Immunol. 7, eabl3795 (2022). 


MOLECULAR BIOLOGY 
Minimally invasive cancer 
classification 


Molecular characterization of 
hormone-responsive cancers 
by mapping tumor-specific 
transcription factor–binding
locations holds promise for track- 
ing the progression of cancers. 
However, invasive biopsies and 
labor-intensive chromatin immu- 
noprecipitation and sequencing 
(ChIP-seq) techniques limit the 
reach of these approaches. Rao 
et al. developed a minimally 
invasive methodology that relies 
on sequencing the cell-free DNA 
circulating in plasma to generate 
an accurate map of transcrip- 
tion factor binding in estrogen 
receptor–positive breast cancer.
The authors successfully used 
this mapping method to classify 
tumors associated with differ- 
ent types of estrogen receptor 
expression and mortality. —JHD
Sci. Adv. 10.1126/ 
sciadv.abm4358 (2022).

The meaning of rapid
eye movement 


Sleep includes phases charac- 
terized by rapid eye movement 
(REM) that are known to be 
associated with dreaming. But 
are these eye movements related 
to the contents of consciousness 
in that sleep state? Senzai and 
Scanziani recorded head direc- 
tion cells in the anterior dorsal 
nucleus of the thalamus in mice 
during wake and sleep (see the 
Perspective by De Zeeuw and 
Canto). The direction and ampli- 
tude of eye movements encoded 
the direction and amplitude of the 
heading of mice in their virtual 
environment during REM sleep. 
It was possible to predict the 
actual heading in the real and 
virtual world of the mice during 
wake and REM sleep, respectively, 
using saccadic eye movements. 
—PRS 

Science, abp8852, this issue p. 999; 

see also add8592, p. 919 


Another twist 

Metasurfaces are specially 
designed arrays of dielectric 
components that transform the 
function of bulk optical compo- 
nents into thin films. Exploiting 
the physics of bound states in the
continuum for the highly efficient 
trapping of light, Santiago-Cruz 
et al. demonstrate metasurfaces
that operate as sources
of quantum and chiral light,
respectively. Patterned in gallium 
arsenide, the quantum source 
can provide entangled pairs of
photons across a broad range
of wavelengths, allowing for the
formation of complex quantum
states. The authors also used a
dielectric metasurface doped
with emitting molecules to produce
chiral light and lasing. Both
approaches will be useful for the
development of integrated optical
and quantum optical devices.
—ISO

Science, abq8684, this issue, p. 991

Scanning electron microscope
image of a metasurface with an array
of symmetry-breaking resonators


Depauperate webs 
Over the past 50 years, 60% of 
animal populations have been 
pushed to extinction. Although 
already tragic, such losses also 
have profound impacts on the 
ecological integrity of biologi- 
cal systems. Fricke et al. looked 
across mammalian commu- 
nities globally over the past 
130,000 years and found that 
more than half of the links, or 
connections, within these com- 
munities have been lost (see 
the Perspective by O'Gorman). 
This loss is due to extinction of 
species but also to a reduction 
in the ranges of extant species, 
because the total numbers of 
individuals within a species have 
also declined. Such losses could 
have profound impact on the 
long-term persistence and func- 
tion of ecosystems. —SNV 
Science, abn4012, this issue p. 1008; 
see also add7563, p. 918 


Designer chromosomes 
One of the goals in synthetic 
biology is to generate complex 
multicellular life with designed 
DNA sequences. Being able to 
manipulate DNA at large scales, 
including at the chromosome 
level, is an important step toward 
this goal. So far, chromosome- 
level genetic engineering has 
been accomplished only in 
haploid yeast. By applying gene 
editing to haploid embryonic 
stem cells, Wang et al. achieved 
whole-chromosome ligations in 
mice and successfully derived 
animals with 19 pairs of chro- 
mosomes, one pair fewer than is 
standard in this species. —DJ 
Science, abm1964, this issue p. 967

Perisperm piperine biosynthesis
The pungent taste of black pepper is due in part to
1-piperoyl piperidine (piperine) in the dried fruits of the
Piper nigrum plant. Piperine quantity increases as the 
fruit matures. Jackel et al. show that the final step in 
piperine biosynthesis takes place in the perisperm, which 
supports the embryos of the black pepper fruit and is where 
the relevant enzymes and their products are colocalized. The 
aromatic flavor compound accumulates in specialized stor- 
age cells of the maturing fruit. If not harvested for our dinner 
tables, more than 95% of piperine remains in the dried 
seed shell as the young seedling germinates, itself mostly 
lacking piperine. —PJH Plant J. 111, 731 (2022). 


Black pepper fruits accumulate the aromatic compound piperine 


as they mature. 


Children protect against
severe COVID-19

During the current COVID-19
outbreak, children are less likely
than adults to develop severe
disease, and the causes for this
are obscure. One hypothesis
is that early in life, children are
exposed to other coronaviruses,
which strengthens their
immunity to this virus family.
Solomon et al. show that adults
with children are also protected.
The authors analyzed data from
millions of patients in a health
care system and discovered that
although adults with children
have a higher risk of getting
COVID, they are less likely to
develop serious disease. The
authors could not obtain direct
evidence of preexisting immunity
to coronaviruses, but suggest
that parents are likely exposed
to endemic coronaviruses when
their children are young, providing
them with partial immunity
against COVID-19. —YN

Proc. Natl. Acad. Sci. U.S.A. 119,
e2204141119 (2022).

Good traits for travel

Species’ abilities to establish in new locations
depend on traits such as body size, life history,
and current distribution. Among reptiles, species
that reproduce early and often (i.e., those
with a “fast” life history strategy) are more likely
to become invasive. Weil et al. investigated whether
life history strategy and other traits can explain
the spread of chameleons, a group of lizards that
originated in Africa. Chameleons have dispersed
through multiple events to India, Arabia, and Europe.
Phylogenetic analyses showed that both fast and
slow life histories can be advantageous: Species with
extreme life histories were more likely to have moved
between regions. Large-bodied and coastal species
were also more likely to have spread, especially when
these traits were combined with extreme life histories.
—BEL Ecography 10.1111/ecog.06323 (2022).

Chameleons dispersed out of Africa to Arabia, Europe, and India.

AGING 
Mitigating mitochondrial 
defects 


Mitochondrial DNA mutations 
accumulate during aging, but 
the underlying mechanisms 
remain obscure. Eukaryotic cells 
contain a mixed complement
of mitochondria, some of which
may contain deleterious mutations.
In the fruit fly, Drosophila,
Tsai et al. found that quality
control mechanisms such as
purifying selection ensure that
deleterious mutations are at
a competitive disadvantage.
However, with age, purifying 
selection becomes ineffective. 
Enhancing quality control, either 
genetically to enhance purifying 
selection or pharmacologically 
using kinetin, mitigated the age-associated
increase of mutant
genomes, improved fly vigor,
and reduced neural degenera- 
tion. Similar pharmacological 
treatment of aged wild-type 
mice reversed age-associated 
accumulation of a mitochon- 
drial DNA deletion mutation. 
Thus, future therapies designed 
to modify the progression of 
mitochondrial disease could 
allay age-associated degenera- 
tion. —SMH 

Proc. Natl. Acad. Sci. U.S.A. 119, 

e2119009119 (2022). 


SCIENCE EDUCATION 
A CURE for
museum exhibits 


Course-based undergraduate 
research experiences (CUREs) 
are becoming abundant in 
undergraduate education. 
Donegan et al. explored what a 
CURE would look like outside of 
a classroom. Working with a uni- 
versity museum, and centered 
on the topic of desert biodi- 
versity, students were given 
autonomy and access to the 
museum's biodiversity collec- 
tions to identify a researchable 
question that could be explored 
using relevant data collection 
tools and analytical approaches. 
The experience culminated
in the development of the
museum exhibit rather than a
final exam. Students reported 
positive shifts in engaging in the 
implementation and evaluation 
of museum exhibits and on the 
relevancy of scientific work and 
museum education to broad 
audiences, highlighting the 
value of museums as a novel 
avenue for expanding upon the 
broader relevance of CUREs. 
—MMc 
J. Biol. Educ. 10.1080/ 
00219266.2022.2103168 (2022). 


BIOMATERIALS 
Optimizing the 
environment for healing 


The delivery of growth factors 
can enhance chronic wound 
repair, but challenges remain 
in optimizing growth factor 
activity over time. Hwang et al. 
developed a hydrogel made of 
hyaluronic acid and collagen 
that incorporated electrostatically
complexed DNA
and polyethylenimine polyplexes
using collagen mimetic
peptide–based tethers. The
authors looked at how changes 
in the fraction of collagen 
mimetic peptides enhanced the 
retention of the polyplexes to 
enable growth factor delivery 
over many days and how the
incorporation of hyaluronic acid
in the hydrogel created a local- 
ized environment that enhanced 
gene transfection efficiency. 
—MSL 


Acta Biomater. 10.1016/
j.actbio.2022.07.039 (2022).


ORGANIC SYNTHESIS 
Anaerobic oxidation 
of alkenes 


Carbonyl compounds can 
be synthesized from alkenes 
through aerobic routes such as 
ozonolysis or with a combina- 
tion of osmium tetroxide and 
sodium periodate. Control of 
stoichiometry, the formation of 
waste products, and the need 
for highly oxidizing condi- 
tions limit the scope of alkene 
reactants. Wise et al. report an 
anaerobic route in which the 
visible light photoexcitation 
of a nitroarene (4-cyano- 
nitrobenzene) generates pairs 
of oxygen radicals that undergo 
nonspecific cycloaddition with 
the double bonds. A wide variety 
of alkenes bearing functional 
groups that would not tolerate 
oxidizing conditions formed 
carbonyl compounds through 
this reaction. —PDS 

J. Am. Chem. Soc. 10.1021/
jacs.2c05648 (2022).

Connecting genes
and history 


Stories about the peopling—and 
people—of Southern Europe 
and West Asia have been passed 
down for thousands of years, and 
these stories have contributed 
to our historical understand- 
ing of populations. Genomic 
data provide the opportunity to 
truly understand these patterns 
independently from written history.
In a trio of papers, Lazaridis
et al. examined more than 700 
ancient genomes from across 
this region, called the Southern 
Arc, spanning 11,000 years from 
the earliest farming cultures 
to post-Medieval times (see 
the Perspective by Arbuckle 
and Schwandt). On the basis 
of these results, the authors 
suggest that earlier reliance on 
modern phenotypes and ancient 
writings and artistic depictions 
provided an inaccurate picture of 
early Indo-Europeans, and they 
provide a revised history of the 
complex migrations and popula- 
tion integrations that shaped 
these cultures. —SNV 
Science, abm4247, abq0755,
abq0762, this issue p. 939, 940, 982; 
see also add9059, p. 922 


CORONAVIRUS 


Pandemic epicenter 

As 2019 turned into 2020, a 
coronavirus spilled over from 
wild animals into people, spark- 
ing what has become one of the 
best-documented pandemics 
to afflict humans. However,
the origins of the pandemic in
December 2019 are controversial.
Worobey et al. amassed
a variety of evidence from the
City of Wuhan, China, where 
the first human infections were 
reported. These reports confirm 
that most of the earliest human 
cases centered around the 
Huanan Seafood Wholesale 
Market. Within the market, the 
data statistically located the 
earliest human cases to one 
section where vendors of live
wild animals congregated and
where virus-positive environmental
samples concentrated.
In a related report, Pekar et al.
found that genomic diversity 
before February 2020 comprised 
two distinct viral lineages, A 
and B, which were the result of 
at least two separate cross- 
species transmission events into 
humans (see the Perspective by 
Jiang and Wang). The pre- 
cise events surrounding virus 
spillover will always be clouded, 
but all of the circumstantial 
evidence so far points to more 
than one zoonotic event occur- 
ring in Huanan market in Wuhan, 
China, likely during
November–December 2019. —CA
Science, abp8337, abp8715,
this issue p. 951, p. 960;
see also add8384, p. 925


NANOPHYSICS 
Levitated interactions 


The ability to trap macroscopic 
objects in vacuum, levitating 
them with optical fields, and 
cooling them to their motional 
ground state, provides access 
to highly sensitive sensors for 
applications in metrology. Rieser 
et al. demonstrate the trapping 
of two silica nanoparticles and 
explore the light-induced dipole- 
dipole interactions between 
them (see the Perspective by 
Pedernales). The results provide 
a route to developing a fully 
tunable and scalable platform 
to study entanglement and 
topological quantum matter with 
nanoscale objects. —ISO 

Science, abp9941, this issue p. 987;
see also add1374, p. 921


OPTICS 


Perfecting absorption 

The absorption of light is 
important in many natural 
processes and device applica- 
tions. Although absorbing a little 
bit of light is easy, absorbing
all of it is generally difficult to
achieve. However, the process
of coherent perfect absorption,
an interferometric effect,
showed that light of a very
specific spatial mode could 
be perfectly absorbed if it 
was matched precisely to the 
properties of a cavity. Slobodkin 
et al. now extend the concept 
to multimode coherent perfect 
absorption (see the Perspective 
by Bertolotti). Using a self-imag- 
ing cavity design, the authors 
demonstrate that complex 
spatial modes can be absorbed 
almost perfectly. —ISO 

Science, abq8103, this issue p. 995; 

see also add3039, p. 924 


ALZHEIMER’S DISEASE 
Determining behavior 
by following tau 


Although some of the symptoms 
of Alzheimer’s disease are con- 
served among patients, multiple 
phenotypes exist, including 
amnestic, visuospatial, language, 
and behavioral/dysexecutive 
symptoms. Understanding the 
mechanisms mediating the 
different behavioral phenotypes 
of Alzheimer’s disease will help 
in the development of specific 
treatments. Therriault et al. stud- 
ied the pattern of tau pathology 
spreading in multiple cohorts of 
patients with different pheno- 
types and showed that each 
phenotype was associated with 
a specific connectivity-based 
pattern of tau aggregation. The 
results suggest that intrinsic 
brain connectivity drives tau 
aggregation patterns, thus 
determining the behavioral phenotype
of the disease. —MM

Sci. Transl. Med. 14, eabc8693 (2022). 


NEUROSCIENCE 
Spiraling into 
neurodegeneration 


Activating mutations in the 
kinase LRRK2 can cause 
Parkinson's disease (PD). 
Mutant LRRK2 promotes the 
processing of amyloid precursor 
protein (APP) into its transcrip- 
tionally active form, the APP 
intracellular domain. Zhang et
al. linked these proteins in a
feed-forward cycle. In cultured 
neurons and brain tissue from 
mouse models of PD, LRRK2- 
mediated phosphorylation 
of APP up-regulated the APP 
intracellular domain, which 
then directly mediated LRRK2 
transcription. Pathological 
markers in LRRK2-mutant PD 
models were reduced by the APP 
intracellular domain inhibitor 
itanapraced, which is currently 
in trials to treat Alzheimer’s 
disease. —LKF 

Sci. Signal. 15, eabk3411 (2022). 
HUMAN GENETICS


The genetic history of the Southern Arc: 
A bridge between West Asia and Europe 


Iosif Lazaridis, Songül Alpaslan-Roodenberg et al.


INTRODUCTION: For thousands of years, humans 
moved across the “Southern Arc,” the area 
bridging Europe through Anatolia with West 
Asia. We report ancient DNA data from 727 
individuals of this region over the past 11,000 
years, which we co-analyzed with the pub- 
lished archaeogenetic record to understand 
the origins of its people. We focused on the 
Chalcolithic and Bronze Ages about 7000 to 
3000 years ago, when Indo-European lan- 
guage speakers first appeared. 


RATIONALE: Genetic data are relevant for under- 
standing linguistic evolution because they can 
identify movement-driven opportunities for lan- 
guage spread. We investigated how the chang- 
ing ancestral landscape of the Southern Arc, as 
reflected in DNA, corresponds to the structure 
inferred by linguistics, which links Anatolian 
(e.g., Hittite and Luwian) and Indo-European 
(e.g., Greek, Armenian, Latin, and Sanskrit) 
languages as twin daughters of a Proto-Indo- 
Anatolian language. 


Many partings, many meetings: How migration and admixture drove early 
language spread. Westward and northward migrations out of the West Asian 
highlands split the Proto-Indo-Anatolian language into Anatolian and Indo- 
European branches. Yamnaya pastoralists, formed on the steppe by a fusion of 


RESULTS: Steppe pastoralists of the Yamnaya 
culture initiated a chain of migrations linking 
Europe in the west to China and India in the 
east. Some people across the Balkans (about
5000 to 4500 years ago) traced almost all their 
genes to this expansion. Steppe migrants soon 
admixed with locals, creating a tapestry of di- 
verse ancestry from which speakers of the Greek, 
Paleo-Balkan, and Albanian languages arose. 
The Yamnaya expansion also crossed the 
Caucasus, and by about 4000 years ago, 
Armenia had become an enclave of low but 
pervasive steppe ancestry in West Asia, where 
the patrilineal descendants of Yamnaya men, 
virtually extinct on the steppe, persisted. The 
Armenian language was born there, related to 
Indo-European languages of Europe such as 
Greek by their shared Yamnaya heritage. 
Neolithic Anatolians (in modern Turkey) were 
descended from both local hunter-gatherers 
and Eastern populations of the Caucasus, 
Mesopotamia, and the Levant. By about 6500 years 
ago and thereafter, Anatolians became more
genetically homogeneous, a process driven
by the flow of Eastern ancestry across the 
peninsula. Earlier forms of Anatolian and non- 
Indo-European languages such as Hattic and 
Hurrian were likely spoken by migrants and 
locals participating in this great mixture. 
Anatolia is remarkable for its lack of steppe 
ancestry down to the Bronze Age. The ancestry of 
the Yamnaya was, by contrast, only partly local; 
half of it was West Asian, from both the Caucasus 
and the more southern Anatolian-Levantine con- 
tinuum. Migration into the steppe started by 
about 7000 years ago, making the later expan- 
sion of the Yamnaya into the Caucasus a return 
to the homeland of about half their ancestors. 


CONCLUSION: All ancient Indo-European speak- 
ers can be traced back to the Yamnaya culture, 
whose southward expansions into the Southern 
Arc left a trace in the DNA of the Bronze Age 
people of the region. However, the link con- 
necting the Proto-Indo-European-speaking 
Yamnaya with the speakers of Anatolian lan- 
guages was in the highlands of West Asia, the 
ancestral region shared by both. 
Genetic History
of the Southern Arc

Around 7000-5000 years ago,
people with ancestry from the
Caucasus (blue) moved west into
Anatolia and north into the steppe.
Some of these migrants may
have spoken ancestral forms
of Anatolian and Indo-European
languages.

Beginning ~5000 years ago,
Yamnaya expansions introduced
Eastern European ancestry (red)
west into the Balkans and Greece
and east across the Caucasus into
Armenia. However, they made no
detectable impact on Anatolia.




newcomers and locals, admixed again as they expanded far and wide, splitting
the Proto-Indo-European language into its daughter languages across Eurasia.
Border colors represent the ancestry and locations of five source populations
before the migrations (arrows) and mixture (pie charts) documented here.
By sequencing 727 ancient individuals from the Southern Arc (Anatolia and its neighbors in Southeastern
Europe and West Asia) over 10,000 years, we contextualize its Chalcolithic period and Bronze Age (about
5000 to 1000 BCE), when extensive gene flow entangled it with the Eurasian steppe. Two streams
of migration transmitted Caucasus and Anatolian/Levantine ancestry northward, and the Yamnaya
pastoralists, formed on the steppe, then spread southward into the Balkans and across the Caucasus
into Armenia, where they left numerous patrilineal descendants. Anatolia was transformed by intra-West
Asian gene flow, with negligible impact of the later Yamnaya migrations. This contrasts with all other
regions where Indo-European languages were spoken, suggesting that the homeland of the Indo-
Anatolian language family was in West Asia, with only secondary dispersals of non-Anatolian Indo-
Europeans from the steppe.


The Balkans and Anatolia are often por-
trayed as being geographically periphe-
ral to Europe and Asia rather than as
central to an interconnected region span-
ning both continents. Here, we take a
different view by providing a systematic gen- 
etic history of what we refer to as the “South- 
ern Arc,” a region (Fig. 1A) centered on the 
large Anatolian peninsula (Turkey), including 
in the west (in Europe) the Balkans and the 
Aegean, and in the south and east, Cyprus, 
Mesopotamia, the Levant, Armenia, Azerbaijan, 
and Iran. We present new genome-wide DNA 
data from 777 individuals from the Southern 
Arc: 727 previously unsampled and 50 previ- 
ously published for which we report new data 
from 1094 newly generated ancient DNA libra-
ries (1). As a resource to guide future sampling
efforts, we also report negative results for
476 samples that we screened using 537 li-
braries and that failed to yield ancient DNA
data meeting the criteria for authenticity (1).
Finally, we provide 239 new radiocarbon dates 
on the same skeletal elements analyzed for 
DNA (1). We studied these along with the 
previously published individuals for a total 
sample size of 1317 ancient individuals in the 
region (Fig. 1B) (1).

Our newly reported data fill many sampling 
gaps in space and time in the Southern Arc. 
In Turkey, our new sampling has a particular 
focus on the western (Aegean, Marmara), north- 
ern (Black Sea), and eastern (Eastern Anatolia, 
Southeastern Anatolia) regions connecting it 
with the rest of the Southern Arc. Another area 
of high-density sampling is Armenia, with sub-
stantial coverage of the Bronze and Iron Ages
representing an order of magnitude more in- 
dividuals than previously available. Many in- 
dividuals of the Bronze-to-Iron Age time frame 
are also sampled from the Iranian highlands 
at Hasanlu, where only a single individual has 
previously been studied (2), and from Dinkha 
Tepe, neighboring Anatolia, Mesopotamia, 
Armenia, and the Caucasus. In the southern part 
of Southeastern Europe, we sample Mycenaean- 
era individuals from multiple regions of the 
Aegean. From the Southern Balkans, we pre- 
sent a full time transect of Albania; numerous 
individuals from North Macedonia, where 
previously data from only a single Neolithic 
individual had been published (3); and more 
than double the previously available body of 
ancient DNA data from Bulgaria. Farther north, 
at the western wing of the Southern Arc, we 
sample individuals from Croatia, Montenegro, 
and Serbia in the west and Romania and 
Moldova in the east, which interface with the 
extensively studied worlds of Central Europe 
and the Eurasian steppe. This dataset includes 
>100 Bronze Age individuals, including many 
from Cetina Valley and Bezdanjača Cave in
Croatia, which add to only five previously 
published from the entire area (3, 4). Some 
of the Balkan individuals include culturally 
Yamnaya individuals from Serbia and Bulgaria, 
allowing us to compare them with those of the
Eurasian steppe. With this greatly enhanced 
dataset across the entire region, we are able to 
fill in major gaps in sampling in time, space, 
and cultural context. Our large sample sizes 
also allow us to identify main clusters as well 
as genetic outliers, providing insights about 
within-population patterns of variation and 
contact networks with neighboring groups. 
Details of all studied individuals can be found 
in (1) (figs. S5 to S21).

To discuss the geographic distribution of 
these individuals, we take a flexible approach, 
in some cases using the names of ecological or 
topographical regions and in others the names
of present-day countries depending on how 
well these align with genetic patterns. In some 
cases, we also use more specific regional loca- 
tion information to add precision (5). In the 
interest of having a uniform nomenclature 
that is easily accessible to readers familiar with 
the current political map of the Southern Arc, 
we also refer to groups of individuals with 
labels prefixed with three-letter International
Organization for Standardization (ISO) codes for coun-
tries, as in Fig. 1. Multiple toponyms have been
used for the same sites during the Southern 
Arc’s long history, and we typically choose labels 
appropriate for the period and/or present-day
usage. To designate the period in which individ- 
uals lived, we use conventional archaeological 
designations for each region; e.g., Eneolithic 
and Chalcolithic both denote copper-using 
cultures in different parts of the archaeological 
literature. We caution that the transition be- 
tween the Eneolithic or Chalcolithic and the 
Bronze Age did not occur simultaneously in 
different parts of the Southern Arc. Detailed 
archaeological information for each individual 
is presented in (1), specifying the analysis labels
we use integrating information from chronol- 
ogy, geography, archaeology, and genetics. 
Fig. 1. Studied individuals and PCA analysis. (A) The geography of the Southern Arc. Sampling locations of
previously published individuals are shown as gray circles, new data on published individuals are shown by 
pink squares, and new individuals are shown as yellow circles. Convex hulls of individuals from each present-day 
country are also shown. (B) Timeline of studied individuals (random uniform jitter applied to the vertical 
dimension). (C) Principal components analysis of ancient individuals projected on modern West Eurasian 
variation. Country names are represented by three-letter International Organization for Standardization (ISO) codes.
Overview of genetic variation in the Southern Arc

To understand genetic variation in the South- 
ern Arc, we began with ADMIXTURE (fig. S1) 
analysis, which allowed us to detect individu- 
als with non-West Eurasian-associated ances- 
try (6) and to appreciate the broad pattern of 
variation in terms of the four West Eurasian 
components that appear in the ADMIXTURE 
analysis: Iran/Caucasus-related, “Eastern hunter- 
gatherer,” Anatolian/Levantine-related, and 
“Balkan hunter-gatherer.” Principal compo- 
nents analysis (Fig. 1C) of Southern Arc indi- 
viduals together with other West Eurasian 
individuals demonstrates the central position 
of the Southern Arc within the continuum of 
West Eurasian variation, with a long “bridge” 
of individuals joining Europe (left) to West 
Asia (right), but with individuals spread across 
the entire range of variation. 

To quantify the ancestry of Southern Arc 
individuals, we developed a five-source mod- 
eling framework (using qpAdm and F4admix) 
(2) that allows a high-resolution description of 
the ancestry of the Southern Arc population as 
a whole and as individuals. To generate this 
model, we used an automated procedure that 
did not preselect a specific set of surrogates for 
the source populations, but instead explored 
many possible sets and identified those that, 
for as many individuals as possible, maximized 
the quality of the statistical fit of the model 
while minimizing the standard errors in infer- 
ences of ancestry proportions (tables S1 to S21 
and figs. S22 to S27). After applying this pro- 
cedure, the five sources of ancestry that we used 
are: Caucasus hunter-gatherers (CHG) (7), East- 
ern hunter-gatherers (EHG) from Europe (8, 9), 
Levantine Pre-Pottery Neolithic (10), Balkan
hunter-gatherers from the Iron Gates in Serbia 
(3), and Northwestern Anatolian Neolithic from 
Barcin (9). These correspond to the four-source 
ADMIXTURE model, with further distinction 
between the Anatolian and Levantine ends of 
the “Mediterranean” interaction zone (11). These
five sources should not be unduly emphasized 
beyond their utility as a descriptive convenience 
because (i) they could be swapped for related 
ones [e.g., Neolithic Iran captures much of the 
same deep ancestry as Caucasus hunter- 
gatherers do (10, 11)], (ii) they were themselves
derived from earlier (more “distal”) popula- 
tions [e.g., Levantine Pre-Pottery Neolithic 
from earlier Natufian hunter-gatherers (10)],
and (iii) they transmitted their ancestry through 
later (more “proximal”) sources [e.g., Eastern 
hunter-gatherers through Yamnaya steppe 
pastoralists (8)]. The inferred proportions of 
ancestry for individuals are summarized in 
figs. S2 to S4 and figs. S28 to S76 and are dis- 
cussed in detail in (1).
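
The idea of describing an individual as a weighted mixture of five sources can be illustrated with a deliberately simplified sketch. It is not the qpAdm or F4admix procedure used in the study; it only assumes that a target's vector of summary statistics is approximately a weighted average of the corresponding source vectors, and it solves for weights that sum to one. The function name `estimate_proportions` and the synthetic statistics are hypothetical, introduced purely for illustration.

```python
import numpy as np

def estimate_proportions(source_stats, target_stats):
    """Least-squares mixture weights constrained to sum to 1.

    source_stats: (n_stats, n_sources) array, one column per source population.
    target_stats: (n_stats,) array for the target individual or population.
    Weights are not clipped to [0, 1], so small negative estimates can occur.
    """
    A = np.asarray(source_stats, dtype=float)
    b = np.asarray(target_stats, dtype=float)
    n = A.shape[1]
    # KKT system for: minimize ||A w - b||^2 subject to sum(w) = 1
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = 2.0 * A.T @ A
    kkt[:n, n] = 1.0
    kkt[n, :n] = 1.0
    rhs = np.concatenate([2.0 * A.T @ b, [1.0]])
    return np.linalg.solve(kkt, rhs)[:n]

# Synthetic example (illustrative values only, not data from the study).
rng = np.random.default_rng(0)
sources = rng.normal(size=(20, 5))            # 20 statistics x 5 sources
true_weights = np.array([0.45, 0.05, 0.20, 0.00, 0.30])
target = sources @ true_weights               # noise-free mixture
weights = estimate_proportions(sources, target)
print(np.round(weights, 3))
```

Because the weights are constrained only to sum to one, noisy input can yield slightly negative values, which mirrors the remark in the Fig. 2 caption that some individual ancestry proportions are shown as negative because of statistical uncertainty.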


The Anatolian core of the Southern Arc 


When we apply our five-way model to in- 
dividuals from Anatolia (Fig. 2, A to E), it is 
from the Pottery Neolithic to the Roman/Byzantine period. Boxes in this and
subsequent figures indicate the temporal extent (horizontal) and 95% confidence
interval (±1.96 SE) for each period; we also show the fit (solid line) and 5/95%
(dotted lines) of the fit of a heteroskedastic Gaussian process (53) on the individuals 
without any assignment to populations, which allows us to appreciate the degree 
of variation in ancestry in each time period (ancestry proportions for some 
individuals are shown as negative, reflecting statistical uncertainty in the estimates). 
Here and in subsequent figures, numbers in brackets are sample sizes. The results 
show that across the peninsula, the post-Neolithic period was characterized by 


related ancestry (E). EHG-related ancestry from both the steppe/Eastern Europe 
(B) and the Balkans (D) was insignificant until the past 3000 years. (F) A detailed 
look at the Chalcolithic/Bronze Age period showing that populations had ancestry 
intermediate between early farmers from Western/Central Anatolia [Barcin (9),
Tepecik-Çiftlik (13), and Çatalhöyük (12)] on the one hand and Southeastern Anatolia (Northern
Mesopotamia at Mardin) on the other, the result of admixture between the preceding
Neolithic populations, without discernible external influences (that would have
elevated any of the five components above their Neolithic levels). PPN, Pre-Pottery
Neolithic. ±1 SE shown.


immediately apparent that before ~3000 years 
ago, virtually all ancestry is drawn from local 
West Asian sources (Northwest Anatolian 
Neolithic, hereafter called “Anatolian,” Levan- 
tine, Caucasus), with negligible contribution 
from the two European (Balkan and Eastern
hunter-gatherer) sources of our model. Broad- 
ly speaking, the temporal trend is one of in- 
creasing Caucasus/Levantine-related ancestry 
between the Neolithic and Chalcolithic periods, 
with a corresponding decrease of the Anatolian-
related ancestry. To better understand this 
process in the Anatolian peninsula, we ex- 
amined geographical subpopulations of the 
Chalcolithic and Bronze Age compared with 
the Neolithic ones that preceded them (Fig.
2F). We observed that Northwest Anatolian-
related ancestry varied from ~100% (at
Barcin, Mentese, and Ilipinar in the Marmara
region; we use the high-quality data we have
from Barcin to define this component of an-
cestry) to ~16% (the Pre-Pottery Neolithic in-
dividual from Mardin in Southeast Anatolia/
North Mesopotamia). Conversely, Caucasus and
Levantine ancestry varied from ~50 and
~32%, respectively, in North Mesopotamia to
~0% in Northwest Anatolia.

The Chalcolithic period in Anatolia has a 
wide temporal range (Fig. 2) that spans from 
the end of the Neolithic (~6000 BCE) to the 
beginning of the Bronze Age (~3000 BCE). In- 
dividuals in our analysis are mostly from the 
Late Chalcolithic (after ~4500 BCE) and from 
the entirety of the Bronze Age (down to 1300 
BCE). Both Chalcolithic and Bronze Age pop- 
ulations from all regions generally had inter- 
mediate admixture proportions within the 
Neolithic ranges of ancestry. This suggests 
that they could be modeled as drawn from 
mixtures of the preceding Neolithic popula- 
tions. In the Marmara region, Caucasus hunter- 
gatherer ancestry increased from ~0 to ~33% 
between the Neolithic and Chalcolithic periods 
[to define the Chalcolithic, we added four in- 
dividuals from Ilipinar to a single one from 
Barcin previously published (10)]. In the Cen-
tral region, we document an increase from
~10 to 15% at Neolithic Çatalhöyük (12) and
Tepecik-Çiftlik (13) to a similar ~33% at Chal-
colithic Çamlıbel Tarlası (14) and ~42% at
Bronze Age Kalehöyük and Ovaören (15). In
the Mediterranean region (Southwest Anatolia),
the same approximate one-third proportion was
present at Harmanören Göndürle (16) in the
Bronze Age. In the Aegean region (Western 
Anatolia), we observe a similar ~29% in the 
Bronze Age. Thus, individuals from more west- 
ern regions of Anatolia (Marmara, Aegean, 
Central, and Mediterranean) all had more 
Caucasus-related ancestry (and corresponding- 
ly less Anatolian-related ancestry) during the 
Chalcolithic and Bronze Age than the preced- 
ing Neolithic populations of the area, suggest- 
ing that a spread of this ancestry westward 
across the peninsula occurred after the Neo- 
lithic, a pattern also observed in the Levant 
(11). In the more eastern regions of Anatolia 
[East, in Arslantepe (14); Southeast, from Bat-
man, Gaziantep, Kilis, and Şırnak (new data)
and Titriş Höyük (14); Black Sea, from Devret
Höyük in Amasya and Samsun (new data) and
İkiztepe (14)], populations of the Chalco-
lithic and Bronze Age periods had, conversely,
more Western Anatolian Neolithic-related, 
and less Caucasus-related ancestry, than the 
Pre-Pottery Neolithic individual from Mardin. 
This pattern is also observed when we com- 
pare the Chalcolithic with the Bronze Age. 
Differences are small but all in the direction 
of more Western Anatolian Neolithic-related
ancestry (an increase of ~3 to 7% in the East,
Southeast, and Black Sea regions) except in
the Hatay Province (14), where Western Ana-
tolian Neolithic-related ancestry decreased and
Caucasus-related ancestry increased (from 
~14 to 43%) between the Early Chalcolithic 
(~5500 BCE) and the Middle to Late Bronze 
Age (after ~2000 BCE). 

Taken as a whole, the genetic history of 
Anatolia during the Chalcolithic and Bronze 
Age can be characterized as one of homogeni- 
zation. Neolithic populations differed by as 
much as ~80% in terms of Western Anatolian 
Neolithic-related and by ~50% in terms of 
Caucasus-related ancestry. In the Chalcolithic 
and Bronze Age, the range of these differences 
narrowed substantially. That of Western Ana- 
tolian Neolithic-related ancestry halved to 
~40% (becoming ~20 to 60%) and that of 
Caucasus-related ancestry to ~15% (becoming 
~30 to 45% except in the Hatay Province). 
Despite this homogenization, some ancestry 
differences persisted. The eastern regions re- 
tained more Caucasus-related ancestry than 
the western ones, but the overall pattern was 
one of attenuated differentiation after intra- 
Anatolian gene flow stemming from the highly 
differentiated Neolithic populations of Western/ 
Central Anatolia on the one hand and North- 
ern Mesopotamia on the other (as well as 
hitherto unsampled others). 
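
The narrowing described above is simple range arithmetic, and a short sketch makes the comparison explicit. It uses only the approximate percentages quoted in this section; the helper names (`spread`, `narrowing`) are illustrative and not part of the study's methods.

```python
def spread(low, high):
    """Width of an ancestry range, in percentage points."""
    return high - low

# Approximate figures quoted in the text.
NEOLITHIC_SPREAD = {
    "Western Anatolian Neolithic-related": 80,
    "Caucasus-related": 50,
}
LATER_RANGE = {  # Chalcolithic and Bronze Age (Hatay Province excluded for Caucasus)
    "Western Anatolian Neolithic-related": (20, 60),
    "Caucasus-related": (30, 45),
}

def narrowing():
    """Map each component to (later spread, later spread / Neolithic spread)."""
    result = {}
    for component, (low, high) in LATER_RANGE.items():
        later = spread(low, high)
        result[component] = (later, later / NEOLITHIC_SPREAD[component])
    return result

for component, (later, ratio) in narrowing().items():
    before = NEOLITHIC_SPREAD[component]
    print(f"{component}: {before} -> {later} points ({ratio:.0%} of the Neolithic spread)")
```

On these figures, the spread of Western Anatolian Neolithic-related ancestry halves, while that of Caucasus-related ancestry falls to under a third of its Neolithic value.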

Homogenization in Anatolia was coupled with
impermeability to exogenous gene flow from 
Europe, which could be explained by either a 
large and stable population base that atten- 
uated the demographic impact of external 
immigration or cultural factors impeding it. 
The asymmetry of gene flow between Anatolia 
and its neighbors is evident, for example, in the 
fact that Caucasus hunter-gatherer-related 
ancestry flowed westward across Anatolia into 
the Balkans and northward into the Eurasian 
steppe, but Balkan hunter-gatherer ancestry 
did not flow into Anatolia or further eastward, 
and Eastern hunter-gatherer ancestry entered 
West Asia only as far south as Armenia and, to 
a lesser extent, Iran (as we will see below). This 
was true even down to the Urartian period of 
the Iron Age, when a population lacking East- 
ern hunter-gatherer ancestry still existed in 
the center of the Kingdom of Van (6). 


The origin and expansion of steppe pastoralists 


The absence of European hunter-gatherer ad- 
mixture in Anatolia during the Chalcolithic 
and Bronze Age periods contrasts with devel- 
opments to the north of the Southern Arc and 
north of the Black and Caspian Seas, which 
saw the formation of Eneolithic (a term used 
instead of Chalcolithic for this area) and Bronze 
Age pastoralist populations that harbored a 
mixture of populations from Eastern Europe 
and the Southern Arc (8, 9, 17). Examining
individuals from the steppe (Fig. 3), we observe
that in the post-5000 BCE period, Caucasus- 
related ancestry is added to the previous East- 
ern hunter-gatherer population, forming the 
Eneolithic populations at Khvalynsk (9) and 
Progress-2 (17); this ancestry persisted in the 
Steppe Maykop population of the 4th millen- 
nium BCE (17). However, all of these popula- 
tions before ~3000 BCE lack any detectable
Anatolian/Levantine-related ancestry, con- 
trasting with all contemporaneous ones from 
the Southern Arc, which have at least some 
such ancestry at least since the Neolithic (77). In 
all later periods in the Southern Arc, Caucasus 
hunter-gatherer-related ancestry is never found 
by itself but rather is always admixed, to var- 
ious degrees, with Anatolian/Levantine ances- 
try. This suggests that whatever the source of 
the Caucasus-related ancestry in the Eneo- 
lithic steppe, it cannot have been from the 
range of variation sampled in the Southern Arc 
because this would have introduced Anatolian/ 
Levantine-related ancestry. This implies that 
the proximal source of the Caucasus-related 
ancestry in the Eneolithic steppe should be 
sought in an unsampled group that did not 
experience Anatolian/Levantine-related gene 
flow until the Eneolithic. Plausibly, this pop- 
ulation existed in the North Caucasus, from 
which Caucasus hunter-gatherer-related, but 
not Anatolian/Levantine-related, ancestry could 
have entered the Eneolithic steppe. 

The Eneolithic steppe population contrasts 
with that of the Yamnaya cluster of individuals 
~3000 BCE, which does have significant Ana-
tolian (3 ± 1%)- and Levantine (3.5 ± 1%)-
related ancestry [Fig. 3A; steppe individuals 
in this analysis are listed in (1)]. This inference
is further supported by detailed analysis of 
Yamnaya ancestry at different time depths 
(tables S22 to S28) (2), which indicates that 
they derived from at least two southern sources. 
The first source dates to the Eneolithic and 
includes Caucasus hunter-gatherer ancestry 
only. The second source dates to before the 
formation of the Yamnaya cluster and includes 
Anatolian/Levantine-related ancestry in addi- 
tion to Caucasus hunter-gatherer (as deep 
sources), ancestry related to Neolithic people 
of Armenia (more proximally), or ancestry 
related to Chalcolithic people of the Caucasus 
to Southeast Anatolia (even more proximally). 
A more direct and geographically proximate 
source in the Maykop population of the North 
Caucasus of the 4th millennium BCE has also 
been proposed (18). Although the exact source
cannot at present be determined (all of the 
candidates have different combinations of the 
same Anatolian/Levantine/Caucasus ancestry; 
fig. S1), it was people drawn from this metapop- 
ulation in the Chalcolithic Caucasus, Armenia, 
and East/Southeast Anatolia that must have 
been responsible for the second pulse of South- 
ern Arc ancestry into the precursors of Yamnaya 
Fig. 3. Yamnaya origins and expansions. (A) The earliest inhabitants of the steppe (EHG) were followed by
CHG-admixed populations by ~5000 BCE and by Anatolian/Levantine-admixed populations by ~3000 BCE
with the emergence of the Yamnaya-Afanasievo genetic cluster. The proportion of Balkan hunter-gatherer-
related ancestry (not shown) is 0.8 ± 0.6% in the Yamnaya cluster and -0.5 ± 0.5% in the Afanasievo. (B) The
Yamnaya had nearly half their ancestry from CHG, higher than any Bronze Age Europeans from the Balkans,
Italy, or Central/Northern Europe, but their CHG-EHG balance was equal, similar to the Corded Ware/Beaker
clusters of Central/Northern Europe and contrasting with Southeastern and Mediterranean Europe, where CHG
was significantly higher than EHG. 95% confidence intervals of ±1.96 SE are shown.


steppe pastoralists. The genetic contribution of 
the second pulse may have been as low as 
6.5%, the sum of Anatolian and Levantine an- 
cestry in the Yamnaya, or as high as 53.1%, the 
totality of the combined Caucasus hunter-
gatherer and Anatolian/Levantine ancestry. 
The low end is unlikely because Caucasus 
hunter-gatherer ancestry was ubiquitous in 
West Asia during the Chalcolithic period and
some of it should be added to the 6.5% figure. 
The high end is also unlikely because it sug- 
gests that all Caucasus hunter-gatherer ances- 
try flowed northward with the second pulse, 
thus ignoring the evidence for its independent 
flow into the Eneolithic steppe. Our modeling 
suggests intermediate values of ~21 to 26% 
(table S28), in the middle of the 6.5 to 53.1% 
range, an estimate that may be updated in the 
future as better proximate sources in both 
West Asia and the steppe come to light. 
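
The bounds on the second pulse follow from a few additions and subtractions, sketched below using only the percentages quoted in the text. The last helper rests on an added assumption, stated in its docstring, that all Anatolian/Levantine ancestry arrived with the second pulse; the function names are illustrative and the study's own estimate comes from table S28, not from this arithmetic.

```python
ANATOLIAN = 3.0      # % Anatolian-related ancestry in the Yamnaya (3 ± 1% in the text)
LEVANTINE = 3.5      # % Levantine-related ancestry in the Yamnaya (3.5 ± 1% in the text)
UPPER_BOUND = 53.1   # % combined Caucasus hunter-gatherer + Anatolian/Levantine ancestry

def second_pulse_bounds():
    """Return (lower bound, Caucasus hunter-gatherer ancestry implied by the upper bound)."""
    lower = ANATOLIAN + LEVANTINE
    implied_chg = UPPER_BOUND - lower
    return lower, implied_chg

def chg_share_in_second_pulse(pulse_total):
    """Fraction of Yamnaya Caucasus hunter-gatherer ancestry carried by a pulse of this size.

    Assumption for illustration: all Anatolian/Levantine ancestry arrived with the
    second pulse, so the remainder of the pulse is Caucasus hunter-gatherer ancestry.
    """
    lower, implied_chg = second_pulse_bounds()
    return (pulse_total - lower) / implied_chg

lower, implied_chg = second_pulse_bounds()
print(f"lower bound: {lower:.1f}%, implied CHG total: {implied_chg:.1f}%")
for total in (21.0, 26.0):
    print(f"pulse of {total:.0f}% -> {chg_share_in_second_pulse(total):.0%} of CHG ancestry")
print(f"Levantine-to-Anatolian ratio: {LEVANTINE / ANATOLIAN:.2f}")
```

Under that assumption, the intermediate estimate of ~21 to 26% would attribute roughly a third to two-fifths of the Yamnaya's Caucasus hunter-gatherer ancestry to the second pulse, leaving the rest to the earlier Eneolithic flow.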
Archaeological evidence documents how 
western steppe populations interacted with 
European farmer groups such as the Cucuteni- 
Trypillia and Globular Amphora cultures, and 
it was previously suggested that ancestry from 
such groups contributed to the ancestry of the 
Yamnaya (17). Our genetic results contradict
this scenario because European farmers were 
themselves a mixture of Anatolian Neolithic 
and European hunter-gatherer ancestry, but 
the Yamnaya lacked the European hunter- 
gatherer ancestry differentiating European 
from West Asian farmers, and had an ~1:1 ratio 
of Levantine-to-Anatolian ancestry in our 
five-way model, contrasting with the over- 
whelming predominance of Anatolian ancestry 
in European farmers. The Caucasus hunter- 
gatherer/Eastern hunter-gatherer/Western 
hunter-gatherer/Anatolian Neolithic model 
of (17) fails (P < 1 x 107°) because it under- 
estimates shared genetic drift with Levantine 
farmers (Z = 5.6), whose contribution into the 
Yamnaya cannot be explained under that 
model. These results shift the quest for the 
ancestral origins of a component of Yamnaya 
ancestry firmly to the south of the steppe and 
the eastern wing of the Southern Arc. Deter- 
mining the proximate source of the two move- 
ments into the steppe from the south will 
depend on further sampling across the Anatolia- 
Caucasus-Mesopotamia-Zagros area where 
populations with variations of the three com- 
ponents existed. Similarly, on the steppe side, 
study of Eneolithic (pre-Yamnaya) individuals 
could disclose the source dynamics of Caucasus 
hunter-gatherer infiltration northward and 
identify the likely geographical region for the 
emergence of the distinctive Yamnaya cluster, 
which we show has an autosomal signal of 
admixture dating to the mid-5th millennium 
BCE [fig. S5 and (19)], coinciding with the direct 
evidence of the first southern influence pro- 
vided by the Eneolithic individuals of the steppe. 
The role of Yamnaya-like populations in 
spreading both Eastern hunter-gatherer and 
West Asian ancestry into mainland Europe 
has been previously recognized (8), but it has 
also become apparent that some of the latter 
entered Europe independently of steppe ex- 
pansions into the Aegean (9, 16), Sicily (20), 
and even as far west as Iberia (21) by the
Bronze Age. We observe that the Caucasus 
minus Eastern hunter-gatherer ancestry dif-
ference in the Yamnaya is ~0% (Fig. 4B), and 
this allows us to both test whether steppe mi- 
grants into mainland Europe may have orig- 
inated from a different steppe population (with 
a nonequal balance of Caucasus and Eastern 
hunter-gatherer components) and whether ad- 
ditional migrations (with either more Eastern 
or Caucasus hunter-gatherer ancestry, thus 
shifting the difference away from zero) oc- 
curred. We find that the Corded Ware and 
Bell Beaker complex individuals from Europe 
are all consistent with a balanced presence of 
the two components (consistent with having 
been transmitted through a Yamnaya-like pop- 
ulation). Even in the early Corded Ware from 
Bohemia, where a third “northern” source has 
been suggested to have been substantially in- 
volved (22), the difference is one of a small 
3.1 ± 2.1% excess of Eastern hunter-gatherer
ancestry, which is consistent with its having
been transmitted entirely by the Yamnaya, to
the limits of the resolution of our statistical
analysis. This is not the case for Southeastern 
Europe, where Bronze Age individuals had 
an excess of Caucasus over Eastern hunter- 
gatherer ancestry not only in the Aegean (~17% 
in both Minoans and Mycenaeans) (16), but
throughout the Balkan peninsula (Fig. 3B),
where the overall Bronze Age excess is 7.4 ±
1.7% (with by-country estimates of ~4 to 13%).
A possible explanation for this excess is the
existence of a small 5.2 ± 0.6% Caucasus
hunter-gatherer component in the Neolithic
substratum of Southeastern Europe (Fig. 4A);
we estimated that this proportion is ~0 to 1% in
four separate Early Neolithic populations from
Hungary (Starčevo-Körös cultural complex),
France, Spain, and the Linearbandkeramik 
of Austria, Germany, and Hungary (3, 23-30). 
Thus, the excess of Caucasus hunter-gatherer
ancestry in Bronze Age Southeastern Europe relative
to Central/Northern/Western Europe may replicate
this contrast from the Neolithic. However,
the even higher levels observed in the Aegean
[Fig. 3B and (6)] suggest additional gene flow
after the Neolithic by the time of the Early
Bronze Age (31).
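
The comparison of Caucasus minus Eastern hunter-gatherer differences against the balanced (~0%) Yamnaya value can be illustrated with a rough calculation. The sketch below is not the statistical procedure of the study: it assumes that each ± value is one standard error of an approximately normal estimate, and the function name `z_score` and the |z| < 2 threshold are choices made only for this illustration.

```python
# Differences (Caucasus minus Eastern hunter-gatherer ancestry, in %) quoted in the text.
# Assumption of this sketch: each "±" is one standard error of a roughly normal estimate.
estimates = {
    "Early Corded Ware, Bohemia": (-3.1, 2.1),  # 3.1% excess of Eastern hunter-gatherer ancestry
    "Bronze Age Balkans": (7.4, 1.7),           # 7.4% excess of Caucasus hunter-gatherer ancestry
}

def z_score(diff, se):
    """Distance of an estimate from the balanced (0%) expectation, in standard errors."""
    return diff / se

for name, (diff, se) in estimates.items():
    z = z_score(diff, se)
    # |z| < 2 is a conventional, illustrative threshold, not one taken from the study.
    verdict = "consistent with 0" if abs(z) < 2 else "departs from 0"
    print(f"{name}: z = {z:+.2f} ({verdict})")
```

Under these assumptions the Bohemian Corded Ware value lies within about 1.5 standard errors of zero, whereas the Balkan Bronze Age excess lies more than 4 standard errors away.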


Interplay of local, steppe, and West Asian 
ancestries in Southeastern Europe 


Southeastern Europe interfaces geographi- 
cally with both the Eurasian steppe and Ana- 
tolia, and its genetic history (Fig. 4) bears 
traces of both connections, starting from the 
partial replacement of its local Balkan hunter- 
gatherers by Anatolian Neolithic farmers 
beginning ~8500 years ago, followed by the 
expansion of Eastern hunter-gatherer-ancestry- 
bearing steppe populations ~5000 years ago 
(3). Although the Bronze Age was a period of 
partial homogenization in Anatolia, as we have 
seen, in Southeastern Europe, it was a time of 
substantial contrasts. 

Fig. 4. Genetic heterogeneity in Southeastern Europe after the Yamnaya expansion. (A to E) Five components of ancestry in Southeastern Europe. The replacement of hunter-gatherer by early farmer ancestry [(D) and (E)] was followed by the rise of CHG and EHG ancestry over the past 5000 years [(A) and (B)], with Levantine ancestry being relatively unimportant and showing no discernible temporal pattern (C). In (F), we show a linear regression of population dates (using directly radiocarbon-dated individuals for each population) on admixture times in generations; more recent populations have older admixture times, and the regression places admixture between populations related to the Southeast European Neolithic and Yamnaya at 4853 ± 205 years ago and the generation length at 28 ± 4 years, virtually identical to its independent empirical estimation of 28 years.

One aspect of this heterogeneity was the
retention of the local Balkan hunter-gatherer
ancestry itself, which was detected only in the
Balkans (within the Southern Arc), thus pre- 
cluding any substantial migration from the 
area to the rest of the Southern Arc. Balkan 
hunter-gatherer ancestry was variable during 
the Bronze Age and related to geography. A 
marked contrast is found within Romania, 
where our new data show that it makes up ~12% 
of the ancestry of 42 individuals from the 
Bodrogkeresztúr Chalcolithic and ~24 to 30%
in 10 Bronze Age individuals from Carlomanesti 
(Arman) and from Ploiesti and Targsoru Vechi 
south of the Carpathian Mountains. Together 
with another Bronze Age individual from 
Padina in Serbia [2460 to 2296 calibrated 
(cal) BCE] near the Iron Gates, whose Balkan 
hunter-gatherer ancestry was ~37%, these re- 
sults prove substantial hunter-gatherer an- 
cestry preservation in the North Balkans 
postdating the arrival of both Anatolian Neo- 
lithic and steppe ancestry in the region. This 
contrasts with the southern end of the Balkan 
peninsula in the Aegean (6), where neither the 
Neolithic nor the Bronze Age populations 
had any significant Balkan hunter-gatherer 
ancestry, raising the question of whether the 
region’s pre-Neolithic population was more 
similar to that of the North Balkans (Balkan 
hunter-gatherer-like) or Western Anatolia (and 
thus similar to the Neolithic population). 

The key driver of the Bronze Age hetero- 
geneity was the appearance of Eastern hunter- 
gatherer ancestry that became ubiquitous in 
Southeastern Europe after its sporadic Chal- 
colithic appearance (3). This is most evident 
(~31 to 44%) in Moldova at several Bronze Age 
sites, including those of the Catacomb and 
Multi-cordoned Ware cultures, and individuals 
from Romania (Trestiana and Smeeni) on the 
eastern/southeastern slopes of the Carpathians, 
which contrast with the high-Balkan hunter- 
gatherer group from Arman. We also detect a 
contrast between Catacomb culture individuals 
from Moldova and those from the Caucasus 
(17), driven by an individual from Purcari with 
substantial (17 ± 4%) Anatolian Neolithic ancestry,
suggesting some heterogeneity within
this culture on opposite sides of the Black 
Sea. For the rest of the Balkans, the amount 
of Eastern hunter-gatherer ancestry is ~15% 
and drops to ~4% in Mycenaean Greece and 
to negligible levels in Minoan Crete (6, 16). 

Our study identifies a “high-steppe ancestry”
set of individuals, a term we use to refer
to individuals from the Balkans during the 
Early Bronze Age who had unusually high 
proportions of Eastern hunter-gatherer ances- 
try compared with their contemporaries (Fig. 
4B). This includes two previously published 
individuals from Nova Zagora in Bulgaria and 
Vucedol in Croatia (3), as well as five newly 
reported individuals, including an Early Bronze 
Age individual from Cinamak in Albania (2663 
to 2472 calBCE) and four that are culturally
Yamnaya: one from Vojlovica-Humka in Serbia,
two from Boyanovo, and one from Mogila in
Bulgaria. In aggregate, this group of Balkan
individuals has 35.9 ± 2.5% Eastern hunter-gatherer,
36.4 ± 1.9% Caucasus hunter-gatherer,
and 23.0 ± 1.9% Anatolian Neolithic ancestry
compared with the Yamnaya cluster individuals
(46.1 ± 1.0%, 46.6 ± 1.6%, and 3.0 ± 1.0%,
respectively), i.e., the same Caucasus/Eastern 
hunter-gatherer balance as the Yamnaya but 
diluted by about one-fifth by local Neolithic 
ancestry of ultimately Anatolian origin. 
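
The "diluted by about one-fifth" reading follows directly from the point estimates quoted above. The short calculation below is only a sketch: it ignores the standard errors, and the dictionary and variable names are introduced here for illustration.

```python
# Ancestry proportions (%) quoted in the text; standard errors are ignored in this sketch.
yamnaya = {"EHG": 46.1, "CHG": 46.6, "Anatolian Neolithic": 3.0}
high_steppe = {"EHG": 35.9, "CHG": 36.4, "Anatolian Neolithic": 23.0}

# Fraction of each steppe component retained in the high-steppe Balkan group.
retained = {k: high_steppe[k] / yamnaya[k] for k in ("EHG", "CHG")}

# Caucasus minus Eastern hunter-gatherer difference in each group.
balance_yamnaya = yamnaya["CHG"] - yamnaya["EHG"]
balance_high_steppe = high_steppe["CHG"] - high_steppe["EHG"]

# Additional Anatolian Neolithic ancestry relative to the Yamnaya cluster.
added_neolithic = high_steppe["Anatolian Neolithic"] - yamnaya["Anatolian Neolithic"]

for component, fraction in retained.items():
    print(f"{component}: {100 * fraction:.0f}% of the Yamnaya level retained")
print(f"CHG - EHG: Yamnaya {balance_yamnaya:.1f}%, high-steppe group {balance_high_steppe:.1f}%")
print(f"Additional Anatolian Neolithic ancestry: {added_neolithic:.1f} percentage points")
```

Both steppe components are retained at roughly 78% of their Yamnaya level, the Caucasus minus Eastern difference is unchanged, and the shortfall is matched by about 20 percentage points of additional Anatolian Neolithic ancestry.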

When we use DATES (19) to date the admixture
of steppe ancestry in populations of Southeastern
Europe (Fig. 4F and fig. S6), we arrive
at an estimate that this took place ~4850 years 
ago, i.e., precisely after the Yamnaya expan- 
sion, and within the time frame of our “high- 
steppe” cluster individuals. This suggests that 
(as a first approximation) steppe ancestry in 
Southeastern Europe from the Bronze Age on- 
ward was largely mediated by descendants of 
Yamnaya and local Balkan populations and 
not by earlier waves out of the steppe that 
affected the region sporadically. This admix- 
ture need not have taken place in one locality, 
as indicated by the presence of Yamnaya-like 
individuals in several regions of the Balkans, 
spatially beyond both the cultural transition 
zone between steppe pastoralist and settled 
populations (32), and the geographical one 
from the Eastern European flatlands into 
mountainous areas. 
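
The regression logic described for Fig. 4F can be sketched as follows: if every population descends from the same admixture event, then a population's date (in years before present) should fall linearly with the number of generations since admixture, with the intercept giving the admixture date and the negative of the slope giving the generation length. The data points below are invented for illustration and are not the study's estimates; `least_squares` is a hypothetical helper written for this example.

```python
# Hypothetical (generations since admixture, population date in years before present) pairs.
# These values are invented for illustration and are NOT the study's data.
points = [(5, 4720), (20, 4275), (40, 3735), (60, 3160), (80, 2620)]

def least_squares(pairs):
    """Ordinary least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

intercept, slope = least_squares(points)
admixture_date = intercept   # population date at zero generations since admixture
generation_length = -slope   # years elapsed per generation
print(f"admixture ~{admixture_date:.0f} years ago, "
      f"generation length ~{generation_length:.1f} years")
```

A fit of this kind estimates the generation length from the data rather than assuming it, which is why the caption can compare the regression value with an independent empirical estimate.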


Armenia: Fluctuating steppe ancestry against 
a persistent West Asian genetic background 


Armenia is situated in the highlands of West 
Asia to the east of Anatolia and to the south 
of the Caucasus mountains separating West 
Asia from the Eurasian steppe to the north. 
When we examine the trajectory of ances- 
try there (Fig. 5), we observe that the local 
Caucasus hunter-gatherer-related ancestry 
(Fig. 5A) has always been the most important 
component of the population from the Neo- 
lithic to the present, making up ~50 to 70% of 
ancestry over the past 8000 years. As in Ana- 
tolia, the two other components of West Asian 
ancestry had a strong presence as well, making 
up most of the remainder. 

The most noticeable feature of the history 
of Armenia compared with all other Asian re- 
gions of the Southern Arc is the tentative ap- 
pearance of Eastern hunter-gatherer ancestry in 
the Chalcolithic at Areni-1 Cave (10) ~6000 years 
ago (Fig. 5B), followed by its disappearance 
~5000 years ago with the Early Bronze Age 
Kura-Araxes culture and its reappearance at 
the Middle Bronze Age, when a level of ~14% 
was followed by ~10% in the Late Bronze 
Age and Iron Age and then diluted to ~7% by 
the Urartian period of the first half of the 1st 
millennium BCE and to the ~1 to 3% levels 
observed since the second half of that millennium
at sites such as Aghitu and through the
medieval period (at Agarak) down to present- 
day Armenians. When we compare the Middle/ 
Late Bronze Age individuals from Armenia 
(when Eastern hunter-gatherer ancestry was 
highest and from which we have individuals 
from >20 sites) with other West Asian, European,
and steppe populations (Fig. 5E), it is evident 
that Armenia is an outlier. Populations from 
Armenia have significantly more such ancestry 
than all surrounding populations: Anatolia and 
the Levant, where this ancestry is undetected 
during the Bronze Age; Iran, where it makes up 
~2% overall; and even the Maykop cluster pop- 
ulations of the North Caucasus (17), where it 
reaches ~3%. These analyses in Armenia show 
that Eastern hunter-gatherer ancestry flowed 
from the steppe not only west of the Black Sea 
into Southeastern Europe, attaining its mini- 
mum in the Aegean and east of it, but also across 
the Caucasus into Armenia. However, substan- 
tial proportions of steppe ancestry spread no 
further into Anatolia from either west or east. 

The appearance of Eastern hunter-gatherer 
ancestry at Areni-1 Cave is the first known 
genetic influence of peoples of the Eurasian 
steppe on West Asia, although with our cur- 
rent sparse sampling of the Eneolithic steppe, 
we do not know the precise geographical 
source of this ancestry within the steppe. 
The Areni individuals date to the same 5th 
millennium BCE, in which we saw that the 
Eneolithic steppe came to be influenced by 
Caucasus hunter-gatherer-related ancestry 
from the south and to which our admixture 
dating of Yamnaya origins also points. How- 
ever, it was only during the Middle/Late 
Bronze Age that Eastern hunter-gatherer 
ancestry became entrenched in Armenia, at 
least for a while, forming an “enclave” of steppe 
influence in West Asia that eventually dissipated
during the 1st millennium BCE. This
period of relatively high steppe ancestry corresponds
to the “Lchashen-Metsamor” culture
of the Bronze-to-Iron Age (1). Linkage disequil- 
ibrium dating of steppe admixture (Fig. 5F) in 
our extensive set of individuals of average late 
2nd millennium BCE date suggests it occurred 
a millennium and a half earlier, at the middle 
of the 3rd millennium BCE, and thus in paral- 
lel to the transformation of mainland Europe 
and the Balkans. In Armenia itself, the mid-3rd 
millennium BCE corresponds to the demise of 
the Kura-Araxes culture and its succession 
by the “Early Kurgan” culture, followed during 
the end of that millennium by the “Trialeti- 
Vanadzor” complex from which an individual 
from Tavshut (2127 to 1900 calBCE) already 
has the ~10% Eastern hunter-gatherer an- 
cestry of the Lchashen-Metsamor population, 
the first documented steppe descendant in 
Armenia two millennia after the Chalcolithic. 
The analysis of Y chromosomes to which we now
turn provides an independent line of evidence
for a link between the Yamnaya and populations
of Armenia after this 3rd millennium
BCE reappearance of Eastern hunter-gatherer
ancestry.

Fig. 5. A genetic history of Armenia. Shown are changes in the four components of ancestry. (A) CHG is the most important component in all ages, rising to its maximum in the Kura-Araxes culture of the Early Bronze Age. (B) EHG ancestry first appears in the Chalcolithic at Areni Cave, disappears during the Kura-Araxes period, reappears strongly in the Middle-to-Late Bronze Age period, and decreases to about one-third of its peak value by ~2000 years ago. (C and D) Levantine and Anatolian ancestry were present in all periods as minority components. Balkan hunter-gatherer ancestry (not shown) is <1% in all periods. All individuals shown are from Armenia save for two Neolithic and a Chalcolithic individual previously published from Azerbaijan. (E) During the Middle-to-Late Bronze Age peak, Armenia had more EHG ancestry than its neighbors in West Asia (Anatolia, the Levant, and Iran). (F) ¹⁴C-dated Bronze-to-Iron Age individuals from Armenia admixed 52.2 ± 8.0 generations (1460 ± 224 years) before their average date of 1119 BCE, or ~2579 BCE (mid-3rd millennium BCE), assuming a generation length of 28 years (54). We use Early Bronze Age Armenia and Yamnaya cluster individuals from Russia as proxy sources.
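
The conversion in Fig. 5F from generations to a calendar date is simple arithmetic, sketched below with the figures quoted in the caption. The variable names are introduced here for illustration, and the unrounded results differ by about a year from the caption's rounded values.

```python
# Values quoted in the Fig. 5F caption.
generations, generations_se = 52.2, 8.0   # admixture time before the individuals' average date
generation_length = 28                    # years per generation, as assumed in the caption
mean_sample_date_bce = 1119               # average date of the dated individuals

years = generations * generation_length             # elapsed time in years
years_se = generations_se * generation_length       # its uncertainty in years
admixture_date_bce = mean_sample_date_bce + years   # earlier dates are larger BCE numbers

print(f"{years:.0f} ± {years_se:.0f} years before {mean_sample_date_bce} BCE "
      f"= ~{admixture_date_bce:.0f} BCE")
```

The result falls in the middle of the 3rd millennium BCE, in line with the ~2579 BCE given in the caption.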


Y-chromosome links between the steppe and 
West Asia in their genome-wide context 


Y-chromosome variation (tables S29 to S34 
and figs. S77 and S79) (7) can be used to provide
confident upper bounds on the date when
two populations shared ancestors because 
the large number of mutations that can be 
analyzed over almost 10 million nucleotides 
of alignable sequence means that the split 
times in the genealogy are accurately known. 
The ancient individuals’ Y-chromosome anal- 
ysis also has the potential to provide insight 
into social processes.

Fig. 6. Y-chromosome links between the Southern Arc and the Eurasian steppe. (A) Phylogeny of haplogroup R-L389 (R1b1a1) with TMRCA estimates of yfull.com. (B) CHG/EHG ancestral composition of R-L389 Y-chromosome individuals. (C) R-L389 individuals from the Southern Arc, representing a subset of the individuals plotted in (B). Individuals >2000 years old are shown. ka, thousand years ago.

Subclades of Y-chromosome haplogroup R-L389 are particularly informative for tracing connections between the Southern Arc and the Eurasian steppe (Fig. 6). First, haplogroup R-V1636, with an inferred common ancestor in the 5th millennium BCE, documents gene flow between the steppe and the Southern Arc in the Eneolithic/Chalcolithic period (Fig. 6B). R-V1636 is present in two individuals from the Late Chalcolithic at Arslantepe (Turkey) (14) and the Early Bronze Age in Armenia at Kalavan (10). It is also found in the piedmont of the North Caucasus at Progress-2 (17), the open steppe at Khvalynsk II (9), and the Single Grave Culture of Northern Europe (Gjerrild) (33). The individuals from Armenia and Arslantepe lack any detectable Eastern hunter-gatherer autosomal ancestry (Fig. 6C), which is maximized in the Khvalynsk individuals, an observation that provides some evidence for a southern origin for the R-V1636 haplogroup (we caution, however, that the haplogroup occurs earlier in several sites in the north, which could be consistent with an alternative scenario in which male migrants from the steppe introduced it into Southern Arc populations during the Chalcolithic, but their autosomal genetic legacy was diluted by the much more numerous locals). The earliest individuals from the R-L389 clade belong to the R-P297 sister clade of R-V1636, including the hunter-gatherer from Lebyazhinka IV (8, 9) and hunter-gatherers from the Baltic region (3), both without Caucasus hunter-gatherer ancestry, suggesting an Eastern European origin of this clade that would eventually give rise to the R-M269 clade that spread extremely widely in the Bronze Age.

Haplogroup R-M269, which is inferred to have a shared common ancestor in the 5th millennium BCE, is crucial for understanding steppe expansions because it was the dominant lineage of the Yamnaya-Afanasievo group (4, 8, 34) in its 4th millennium BCE R-Z2103–R-M12149 sublineage. In the Balkans, a group of six Bronze Age individuals from the 3rd millennium BCE carrying R-M269 (Fig. 6C) are associated with >30% Eastern hunter-gatherer ancestry, and this includes not only Catacomb and Multi-cordoned Ware individuals from Moldova, adjacent to the steppe, but also individuals
from farther south, including two Yamnaya
males from Bulgaria (Boyanovo and Mogila, 
the latter associated with Yamnaya burial cus- 
tom and with the R-Z2103 haplogroup typical 
of the steppe Yamnaya) and one from Albania 
(Cinamak) belonging to the high-steppe ances- 
try group. By the Late Bronze Age (late 2nd 
millennium BCE) and later, no high-steppe 
ancestry individuals are observed, but steppe- 
associated Y chromosomes persist, including 
R-Z2106, a lineage that links North Macedonia 
(Ulanci-Veles), Albania (Cinamak), the steppe, 
and Armenia. The population of Southeastern 
Europe contrasts strongly with those of the 
Central/Northern Europe and Eurasian steppe 
archaeological cultures of ~3000 to 2000 BCE 
that were strongly associated with particular 
Y-chromosome lineages: Afanasievo (4, 34) with 
the same R-Z2103 as the Yamnaya, Corded 
Ware/Fatyanovo/Sintashta (4, 8, 34, 35) with 
R-M4117, and Beaker (36) with R-L51. In South- 
eastern Europe during the Bronze Age, we 
detect 32/30/21/11 Y chromosomes belonging 
to haplogroups R/J/I/G linking it with Central/ 
Northern Europe and the steppe/West Asia/ 
local hunter-gatherers/Anatolian-European 
Neolithic farmers, respectively. Together with 
the extraordinary heterogeneity in autoso- 
mal ancestry in the Balkans, a picture emerges 
of a fragmented genetic landscape that may 
well parallel the poorly understood linguistic 
diversity in the ancient Balkans, which among 
Indo-European languages includes Paleo-Balkan 
speakers before the spread of Latin and Slavic, 
with Albanian as the only surviving represen- 
tative. Did the early Indo-European language 
become successful in Southeastern Europe be- 
cause it functioned as a “lingua franca,” fa- 
cilitating communication among speakers of 
the diverse languages of previous farmer and
hunter-gatherer populations?

Our newly reported data reveal that a large
proportion of individuals in Armenia and Northwest
Iran belonged to the R-Z2103–R-M12149
haplogroup during the 2nd and early 1st millennium
BCE, providing a genetic link with
the Yamnaya in these regions where no archae- 
ological presence of the Yamnaya culture it- 
self is attested. It definitely represents a more 
direct link than either R-V1636 or the early ap- 
pearance of Eastern hunter-gatherer ancestry 
at Areni-1 Cave in Armenia (10) during the Chalcolithic
at the end of the 5th millennium BCE,
which provides evidence of converse movement 
of Caucasus hunter-gatherer ancestry into the 
steppe Eneolithic. 

Despite the Y-chromosome movement south- 
ward attested by our data, any association 
between R-haplogroup bearers and Eastern 
hunter-gatherer ancestry was lost south of the 
steppe because these bearers had proportions
of Eastern hunter-gatherer ancestry similar to those of I-Y16419
bearers (the second most prevalent lineage in
Armenia). Two Bronze-to-Iron Age sites with 
substantial sample sizes [unrelated males from 
Bagheri Tchala (n = 7) and Noratus (n = 12)] 
have contrasting haplogroup distributions dom- 
inated by R-M12149 and I-Y16419, respectively 
(Fisher’s exact test P < 0.001), suggesting founder 
events, high genetic drift, or a patrilocal mat- 
ing system ~1000 BCE in Armenia. During 
the same period at Hasanlu in Northwest Iran, 
many individuals have no trace of Eastern 
hunter-gatherer ancestry at all despite the 
presence of R-M12149 there (6), suggesting 
that the initial association of this lineage with 
Eastern hunter-gatherer ancestry on the steppe 
had vanished as R-M12149 bearers reproduced 
with Southern Arc individuals without East- 
ern hunter-gatherer ancestry (Fig. 6C). 
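
For readers unfamiliar with the test cited for the two sites, the sketch below shows how a two-sided Fisher's exact test is computed for a 2 × 2 table of haplogroup counts by site. The per-site counts are not given in this text, so the table used here is hypothetical (only the site totals of 7 and 12 come from the text), and the resulting P value is illustrative rather than the study's. The function `fisher_exact_two_sided` is written for this example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# HYPOTHETICAL counts (R-M12149 carriers, other lineages); only the totals 7 and 12 are from the text.
site_a = (7, 0)    # a site with n = 7 unrelated males
site_b = (1, 11)   # a site with n = 12 unrelated males
p_value = fisher_exact_two_sided(*site_a, *site_b)
print(f"two-sided P = {p_value:.2g}")
```

With margins this small, only a strongly skewed split of lineages between the two sites yields a P value below 0.001, which is why such a result points to founder events, drift, or patrilocality rather than sampling noise.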

We observe that, on the steppe, R-M12149 
Y chromosomes (within haplogroup R1b) at
the beginning of the 3rd millennium BCE,
associated with the Yamnaya, were replaced
by the beginning of the next millennium by
R-Z93 Y chromosomes (within haplogroup
R1a), associated with Corded Ware/Fatyanovo
(35) steppe descendants such as those of the 
Sintashta culture (34). Genetic data cannot dis- 
tinguish whether this Y-chromosome replace- 
ment was the result of competition between 
patrilineal groups from the steppe, one of 
which may have had cultural adaptations such 
as usage of an improved variety of domesti- 
cated horse (37), or whether one group simply 
filled an ecological niche vacated by earlier 
groups. A fuller understanding of the reason for 
this profound genetic change requires combined 
analysis of genetic and archaeological data. 

Whatever the reason for their demise on the 
steppe itself, the Yamnaya-descended R-Z2103 
patrilineages survived in Armenia down to 
the present day, where this clade is present in 
appreciable frequencies in all studied Armenian 
groups (38) despite the substantial dilution of 
autosomal steppe ancestry documented in our 
study. The persistent and lasting presence of 
Yamnaya patrilineal descendants in Armenia 
contrasts with mainland Europe and South Asia, 
where steppe ancestry was introduced by people 
who were not patrilineal descendants of the 
dominant R-M12149 lineage of the Yamnaya 
population. Instead, they belonged to different 
descent groups who had received autosomal 
steppe admixture while carrying different pre- 
dominant Y-chromosome lineages. Armenia 
also contrasts with Anatolia, for which no 
R-M269 Y-chromosomes are observed at all 
during the Chalcolithic, Bronze Age, or Ancient 
(pre-Roman) periods [n = 80 unrelated individuals;
95% confidence interval (CI): 0 to
4.5%] and in which haplogroups J (36 individuals)
and G (17 individuals) are most common.
Haplogroup J is still common at a frequency
of about one-third in present-day people from
Turkey (39), having achieved such prominence
despite occurring in only one of 18 Neolithic
male individuals from Barcın and Ilıpınar in
the Marmara region during the pre-Chalcolithic 
period. A likely explanation for the haplogroup 
J increase is that it accompanied the spread of 
Caucasus hunter-gatherer ancestry inferred 
by our admixture analysis (Fig. 2). This infer- 
ence is made plausible by the fact that both 
Caucasus hunter-gatherer individuals from 
Kotias and Satsurblia (7) and a Mesolithic 
individual from Hotu Cave (10, 34) in Iran be- 
longed to this lineage, suggesting its very old 
presence in the Caucasus/Iran region, and in 
contrast with haplogroup G, which occurred 
in the majority (10/18) of individuals from the 
Neolithic Marmara region. By the Chalco- 
lithic, haplogroups G and J were ubiquitous 
in Anatolia, each making up 10/28 males from 
that period, paralleling the homogenization 
that had occurred by that time. 
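
The upper confidence limit quoted above for the frequency of R-M269 in Anatolia (0 observed among 80 unrelated individuals; 95% CI: 0 to 4.5%) can be reproduced with an exact binomial calculation. The text does not say which interval method was used, so treating it as a two-sided exact (Clopper-Pearson) interval is an assumption of this sketch; the function name is also introduced here.

```python
def zero_count_upper_bound(n, confidence=0.95):
    """Upper limit of the two-sided exact binomial interval when 0 of n individuals carry a lineage."""
    alpha = 1.0 - confidence
    # With zero carriers, the upper limit p solves (1 - p) ** n = alpha / 2.
    return 1.0 - (alpha / 2.0) ** (1.0 / n)

upper = zero_count_upper_bound(80)
print(f"0/80 observed: 95% CI 0 to {100 * upper:.1f}%")  # prints "0/80 observed: 95% CI 0 to 4.5%"
```

The calculation makes explicit that an observed frequency of zero still leaves room for a true frequency of a few percent at this sample size.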


The Indo-Hittite hypothesis in the light of
genetic data


We discuss the implications of our genetic 
findings for hypotheses about the origins and 
spread of Indo-European and Anatolian lan- 
guages. We also highlight a caveat: In contrast 
to findings about movements of people, the 
relevance of genetics to debates about lan- 
guage origins is more indirect because lan- 
guages can be replaced with little or no genetic 
change and populations can migrate and mix 
with little or no linguistic change. Neverthe- 
less, the detection of migration is important 
because it identifies a plausible vector for lan- 
guage change (40). 

The discoveries of massive migrations from 
the steppe both westward into Central and 
Western Europe (4, 8), and eastward into South 
Siberia (4) and Central/South Asia (34), have 
provided powerful evidence for the theory of 
steppe Indo-European origins by linking pop- 
ulations all the way from Northwest Europe 
(36) to India and China through common steppe 
ancestry. The present study adds further sup- 
port to the theory by the discovery of ubiquitous 
ancestry from the steppe in the Bronze Age 
Balkans [where, indubitably, Indo-European 
Paleo-Balkan languages such as Thracian and 
Illyrian (41) were spoken], including individuals
of predominantly steppe ancestry; by documenting
the ubiquity of steppe ancestry in
Bronze and Iron Age Armenia where Armenian 
is first attested and links between Armenia, 
the steppe, and the Balkans; and by the fur- 
ther documentation of steppe ancestry in the 
Aegean (6) during the Mycenaean period when 
the Greek language is first attested, albeit 
at lower levels. All ancient and present-day 
branches of the Indo-European language family
can be derived from or at least linked to the early
Bronze Age Yamnaya pastoralists of the steppe 
or genetically similar populations. 

A link to the steppe cannot be established 
for the speakers of Anatolian languages be- 
cause of the absence of Eastern hunter-gatherer 
ancestry in Anatolia (4, 10, 14, 16), which our 
study reinforces in three ways: (i) by docu- 
menting its paucity in ~100 new Anatolian in- 
dividuals from the Chalcolithic to pre-Roman 
antiquity, (ii) by contrasting western parts of 
Anatolia with its immediate Aegean-Balkan 
neighbors to the west, and (iii) by contrasting 
eastern/northern parts of Anatolia with its 
neighbors in Armenia in the east. Certainly, 
the absence of Eastern hunter-gatherer an- 
cestry in Anatolia can never be categorically 
proven (because more sampling can always 
disclose some such ancestry); however, at 
present, and despite extensive sampling, such 
ancestry is not detected either at possible 
entry points (west and east by land or even 
north by sea) or in the population as a whole. 

The Indo-Hittite hypothesis, first proposed by 
E. H. Sturtevant in 1926 (42), has been partially
supported by more modern phylolinguistic
analyses, indicating that Anatolian languages 
such as Hittite are basal to the rest of the Indo- 
European family tree (43) and suggesting an 
early split between the two. We have shown 
that Anatolia was indeed transformed by the 
Late Chalcolithic through the spread of Caucasus 
hunter-gatherer-related ancestry to its west- 
ernmost edges, as were apparently Eneolithic 
populations of the steppe, which included also 
Anatolian/Levantine-related ancestry by the 
time of the formation of the Yamnaya pastor- 
alists. It is premature to identify the proxi- 
mate sources of these movements before all 
the candidate source populations of Anatolia, 
North Mesopotamia, Western Iran, Armenia, 
Azerbaijan, and the Caucasus have been ad- 
equately sampled. 

Our analyses show that there were at least 
two gene flows from two groups related to 
West Asians into the steppe, which transformed 
the steppe’s population and may have induced 
linguistic change there. The reverse movement 
is more tentative, with early influences from 
the north such as at Areni Cave (10) or possibly
associated with R-V1636 Y-chromosomes, not 
making a sizable genetic impact on the pop- 
ulation of Anatolia. The evidence is consistent 
with two hypotheses. 

Hypothesis A postulates that Proto-Indo- 
Anatolian (including both Anatolian languages 
and Proto-Indo-European) was spoken by a 
population with high Eastern hunter-gatherer 
ancestry that had a disproportionate linguistic 
impact on Anatolia while contributing little 
if any ancestry. In the post-Bronze Age land- 
scape of Anatolia, we do find outliers marked 
by European or steppe influence (6), but this is a 
period when Anatolia is influenced by numerous 
linguistically non-Anatolian Indo-European 
populations, including Phrygians, Greeks, 
Persians, Galatians, and Romans, to name 
only a few. However, in individuals from 
Gordion, a Central Anatolian city that was 
under the control of Hittites before becoming 
the Phrygian capital and then coming under 
the control of Persian and Hellenistic rulers, the 
proportion of Eastern hunter-gatherer ancestry 
is only ~2%, a tiny fraction for a region con- 
trolled by at least four different Indo-European- 
speaking groups. In medieval times, Central 
Asian ancestry associated with Turkic speak- 
ers was added (6), and it persists to the pres- 
ent. Clearly, Anatolia has not been impervious 
to linguistic change during its recorded his- 
tory, and the harbingers of that change are 
also detected genetically, even if as outliers. 
By contrast, the complete absence of Eastern 
hunter-gatherer ancestry in the Chalcolithic 
and Bronze Age either as isolated outliers or 
as a general low-level presence challenges the 
steppe theory to suggest a plausible mecha- 
nism of how a population that made little, if 
any, genetic impact could nonetheless effect
large-scale linguistic change. A common
vocabulary for wheeled vehicles is not attested
across both the Anatolian languages and the rest of
the Indo-European languages (44), thus potentially
removing a technological advantage
regarded as crucial in the dissemination
of Indo-European languages (45).

Hypothesis B postulates that Proto-Indo- 
Anatolian was spoken by a population of West 
Asia and the Caucasus with low or no Eastern 
hunter-gatherer ancestry, which affected both 
Anatolia and the steppe. Hypothesis B may 
help to explain the linguistic diversity observed
in Bronze Age Anatolia, in which
Anatolian (Hittite, Luwian, and Palaic) speakers
coexisted with speakers of other languages
including Hattic (a non-Indo-European linguistic
isolate of Central/Northern Anatolia)
and Hurrian [a non-Indo-European language
from Eastern Anatolia and North Mesopotamia
related to the later Iron Age Urartian language
(6)]. The non-Indo-European Hattic
language, attested only in Anatolia, would 
most economically represent the linguistic 
substratum, spoken by a population of high 
Anatolian-related ancestry, whereas the Indo- 
European Anatolian languages would be 
spoken by a population of high Caucasus 
hunter-gatherer-related ancestry. The spread 
of people of high Caucasus hunter-gatherer 
ancestry across the peninsula from the east, at 
least some of whom may have spoken early 
forms of Anatolian languages, would simul- 
taneously explain both the genetic homogeni- 
zation before the Late Chalcolithic (Fig. 2) and 
the coexistence of the two linguistic groups. 
How many of the peoples associated with the 
spread of Caucasus hunter-gatherer ancestry 
spoke Anatolian languages? People speaking 
other languages related to the diverse non-Indo- 
European language families of the Caucasus, 
such as Kartvelian and Northwest/Northeast 
Caucasian, may have also participated in the 
westward movements. 

As for the steppe, at least two streams of 
migration from the south (Eneolithic and 
Yamnaya-specific) present the opportunity for 
an early (Chalcolithic) split of Yamnaya lin- 
guistic ancestors from the Anatolian linguistic 
ancestors, followed 1000 to 2000 years later 
by the dispersal of Indo-European languages 
from the steppe with the expansion of the 
Yamnaya culture. Linguistic borrowings (46) 
between Proto-Indo-European and other lan- 
guage families such as Kartvelian (spoken 
primarily in Georgia) could be useful for local- 
izing the Proto-Indo-Anatolian homeland, but 
these may have alternatively come about by 
long-range mobility since the Chalcolithic, prov- 
en by such evidence as the presence of R-V1636 
descendants ~3000 km apart from Khvalynsk 
to Anatolia during this period. Contributions of 
Indo-European to Uralic languages (spoken in 
the forest zone of Eastern Europe and Siberia)
appear to have involved only Indo-Iranian
speakers ~4200 years ago (47). This is impor- 
tant because it constrains the migratory history 
of Proto-Indo-Iranian, consistent with genetic 
evidence (34) that it spread through the steppe 
to South Asia and ruling out the possibility that 
it spread from West Asia to South Asia over the 
Iranian plateau. However, the contribution of 
Indo-Iranian to Uralic languages does not shed 
light on the deeper question of early Indo- 
Anatolian origins. A challenge for the theory 
that Proto-Indo-Anatolian was formed in the 
south in a Caucasus hunter-gatherer-rich pop- 
ulation will be to trace the origins of the auto- 
somal ancestry of the Yamnaya in the Caucasus 
or West Asia [where some existing proposals 
place the Proto-Indo-Anatolian homeland 
(32, 48, 49)] and to identify the place from 
which the R-M269 ancestral lineage expanded, 
because this will be a most plausible secondary 
homeland of Indo-European expansion out- 
side of Anatolia. 

The scenario of a West Asian source of Proto- 
Indo-Anatolian is consistent with a linguistic 
analysis (50) that places the split of Tocharian 
from the remaining (Inner Indo-European) 
languages ~3000 BCE associated with the 
Yamnaya expansion and the disintegration 
of the remaining languages during the 3rd 
millennium BCE, consistent with our infer- 
ences of major steppe admixture into the 
Balkans and Armenia for the subset of Indo- 
European languages of these regions. The 
Anatolian split is placed by that study at 
~3700 BCE (4314 to 3450 BCE, 95% highest 
posterior density interval), a period during 
which the Caucasus hunter-gatherer ancestry 
first appears as far west as the Chalcolithic individuals
from Northwest Anatolia (at Ilıpınar)
sampled in our study and during which the 
flow of Caucasus hunter-gatherer ancestry into 
the steppe had already commenced. 

Overall, we suggest that a scenario in which 
Anatolian and Indo-European languages are 
descended from a common West Asian pro- 
genitor matches the evidence of population 
change provided by ancient DNA for four 
reasons. First, the genetic transformation of 
Anatolia after the Neolithic and before the 
Late Chalcolithic (Fig. 2) was a clear oppor- 
tunity for linguistic spread resulting in the 
coexistence of Hattic and Anatolian languages. 
Second, the two transformations of steppe 
populations during the Eneolithic and before 
the Bronze Age, with their strong south-north 
directionality (Fig. 3), were opportunities 
for linguistic spread and are consistent with the
Anatolian/Indo-European split inferred by linguists.
Third, steppe migrations into regions
where Indo-European daughter languages 
were spoken, such as the Balkans (Fig. 4), 
Armenia (Fig. 5), Central/Northern Europe 
(4, 8, 36), and Central/South Asia (4, 34), were 
clear opportunities for the disintegration of 
Proto-Indo-European and the dispersal of its
daughter languages across Eurasia. Fourth, the 
absence of such migrations into Anatolia (Fig. 
2F), in contrast to both neighboring Armenia 
and Southeastern Europe [Figs. 4 and 5 and 
(6)], makes Anatolia the only exception in the 
association of steppe ancestry with Indo- 
Anatolian languages. 

This outline of events points toward a con- 
crete research program of investigating the ar- 
chaeological cultures of West Asia, the Caucasus, 
and the Eurasian steppe to identify a popu- 
lation driving transformations of both the 
steppe and Anatolia, linking the two regions. 
The discovery of such a “missing link” (corre- 
sponding to Proto-Indo-Anatolians if our re- 
construction is correct) would bring to an end 
the centuries-old quest for a common source 
binding through language and some ancestry 
many of the peoples of Asia and Europe (47, 57). 

Literary and archaeological sources have preserved a rich history of Southern Europe and West Asia since the Bronze Age that can be complemented by
genetics. Mycenaean period elites in Greece did not differ from the general population and included both people with some steppe ancestry and others,
like the Griffin Warrior, without it. Similarly, people in the central area of the Urartian Kingdom around Lake Van lacked the steppe ancestry characteristic
of the kingdom’s northern provinces. Anatolia exhibited extraordinary continuity down to the Roman and Byzantine periods, with its people serving as
the demographic core of much of the Roman Empire, including the city of Rome itself. During medieval times, migrations associated with Slavic and Turkic
speakers profoundly affected the region.


The works of ancient writers provide powerful
insights into the ancient world, recording
information on different groups,
their political organizations, customs,
relations, and military conflicts. The
manuscript tradition has been augmented by 
the archaeological record, which also includes 
the discovery of texts of past Mediterranean 
and West Asian civilizations. In this work, we 
leverage the power of ancient DNA to provide 
a third source of information about the people 
inhabiting the states and empires of the past. 
Many of these aspects have been recorded, or 
hinted at, in ancient texts composed close to 
the time of the events they describe. However, 
no text is fully objective, and all texts are in- 
evitably shaped by authors’ biases and world 
views. Ancient DNA provides independent 
evidence, with its own strengths and weaknesses,
and cannot paint a picture of the past
on its own. Nonetheless, it complements the 
ancient texts and evidence from archaeology. 
By using genetic data, we can hope to obtain a
more nuanced impression of past processes
(especially with regard to movements of people
and biological phenotypes) than would be possible
without such data.

This study is a part of a comprehensive ar- 
chaeogenetic analysis of the genetic history of 
the populations of the Southern Arc, spanning 
a trio of papers. For a description of the full 
dataset and analysis framework and charac- 
terization of the population history of the 
Chalcolithic and Bronze Age periods, see (1). 
For analysis of the population history of the 
Neolithic, see (2). The present paper focuses on
peoples for which there is also information
from written texts. A main theme is to test the 
extent to which textual insights are supported 
or not supported by the genetic data and 
furthermore to explore what complementary 
information genetics can provide. When we 
reference ancient literature, we use standard 
abbreviations for locating passages in online 
repositories of texts, such as the Perseus Digi- 
tal Library (3). Our study begins at the end of 
the Bronze Age and traces the region’s history 
through the first millennium BCE, through 
the Roman Empire and up to the present, a 
time span of >3000 years. 


The Bronze Age Aegean world 


Previous work has demonstrated that the 
Bronze Age civilizations of Greece of the 
periods labeled Minoan (on the island of
Crete; spanning the entirety of the Bronze Age, 
~3500 to 1100 BCE) and Mycenaean (on the 
Greek mainland and its islands; spanning the 
latter part of the Bronze Age since the last phase 
of the Middle Helladic period to the end of the 
Late Helladic period, ~1750 to 1050 BCE) (4) 
were inhabited by genetically similar popula- 
tions that traced most of their ancestry to the 
Neolithic inhabitants of the region, who, in turn, 
were related to the farmers of Anatolia (4-7).
We refer to people associated with these archaeological
contexts as Minoan and Mycenaean,
recognizing that the people themselves would 
almost certainly not have considered them- 
selves as belonging to two groups divided 
according to this framework defined by ar- 
chaeology and that there was in fact exten- 
sive genetic variation in ancestry among people 
associated with such cultures, as we show here.
Both Mycenaeans and Minoans had extra
“eastern” Caucasus-related ancestry compared
with that of the Neolithic inhabitants of Greece,
but they differed from each other in that the 
Mycenaeans taken as a group had some steppe 
ancestry that Minoans lacked (4). In this work, 
we extend the geographical sampling to mul- 
tiple sites without previously reported an- 
cient DNA data, complementing published 
Mycenaean data from the Peloponnese and 
Salamis and Minoan data from Lasithi and 
Moni Odigitria (4). From Crete, we report a 
Middle Minoan individual from Zakros. From
mainland contexts, we report the first Mycenaean
data from Central Greece (that is, the
previously unsampled region north of the
Isthmus of Corinth), including Attica,
Kastrouli near Delphi in Phokis, and Lokris
in Phthiotis. South of the Isthmus in the
Peloponnese, we report data from many
individuals from the “Palace of Nestor” at
Pylos and its environs, including the elite
“Griffin Warrior,” a young (30 to 35 years
old) male buried in a large stone-built tomb
with hundreds of precious artifacts, many of
them made in Minoan Crete (8).

Fig. 1. Genetic heterogeneity in the Aegean. (A) A map of Aegean sites. (B) Timeline of Aegean individuals, with vertical jitter added to distinguish contemporaneous
individuals. BP, before the present. (C to G) Ancestry changes of five components show an increase of Caucasus hunter-gatherer (CHG) and Eastern European
hunter-gatherer (EHG) ancestry over time and a dilution of Anatolian Neolithic ancestry. (H) During the Minoan and Mycenaean periods of the Bronze Age, Eastern
European hunter-gatherer ancestry was variable, absent in Minoan individuals of Crete, and present in most but not all Mycenaean individuals of the mainland.
s.e., standard error.

To contextualize the transformations in 
the Bronze Age Aegean, it is critical to char- 
acterize the pre-Bronze Age genetic landscape 
(Fig. 1). We begin with the Neolithic inhab- 
itants (4, 6, 7), estimating proportions of 
ancestry using a five-source model that we 
developed for Southern Arc Holocene populations
(1), which includes as proxies for the
sources Caucasus hunter-gatherers (9), Eastern 
European hunter-gatherers (5, 10), Levantine 
Pre-Pottery Neolithic (17), Balkan hunter- 
gatherers from the Iron Gates in Serbia (7), 
and Northwestern Anatolian Neolithic from 
Barcin (5). We infer that not only Neolithic 
Greeks from the Peloponnese (7) but also 
those from Northern Greece (6) had ~8 to 10% 
Caucasus hunter-gatherer-related ancestry 
(Fig. 1C). We find small amounts of Caucasus 
hunter-gatherer-related ancestry in South- 
eastern Europe and Neolithic populations in 
general, which is different from the pattern in 
Central/Western Europe where there is none 
(1). This is consistent with multiple streams
of migration from different Anatolian Neolithic
populations into Europe.

Both Caucasus and Eastern European hunter- 
gatherer-related ancestry increased in the 
Bronze Age in the Aegean just as the Anatolian- 
related ancestry decreased (Fig. 1), with Mycenaean
Greeks having 21.2 ± 1.3% Caucasus
hunter-gatherer ancestry and 4.3 ± 1.0% Eastern
European hunter-gatherer ancestry. Given
the evenly balanced proportions of these components
in the Yamnaya and the “high steppe”
cluster from the Balkans (1), it can be assumed
that the Eastern European hunter-gatherer
component in the Aegean was not introduced
there on its own but rather was accompanied
by an approximately matching amount of
Caucasus hunter-gatherer ancestry, thus
leaving a remainder of ~21.2 − 4.3 = 16.9%
Caucasus hunter-gatherer. This allows us to
infer that steppe migrants admixed with a
population whose composition must have
included 16.9%/(1 − 2 × 0.043), or ~18.5%, Caucasus
hunter-gatherer-related ancestry. Notably, the estimated
proportion of Caucasus hunter-gatherer
ancestry in Minoans is virtually identical at
18.3 ± 1.2%. Thus, our analyses resolve the
question of the origins of the Late Bronze 
Age population by strongly supporting one of 
two previously proposed hypotheses (4)—that 
Mycenaeans were the outcome of admixture 
of descendants of Yamnaya-like steppe mi- 
grants with a Minoan-like substratum, rather 
than the hitherto plausible alternative scenario 
of an Anatolian Neolithic-like substratum ad- 
mixing with an Armenian-like population from 
the east. This alternative scenario is further 
contradicted by the fact that pre-Mycenaean 
period individuals belonging to the Early 
Bronze Age from the islands of the Cyclades 
and Euboea in Southern Greece in ~2500 BCE 
(12) had 21.2 ± 1.7% Caucasus hunter-gatherer-related
ancestry (12), consistent with our
inferred proportion and providing direct 
evidence for the predicted Minoan-like sub- 
stratum (4). 
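
The arithmetic behind the inferred substratum can be retraced in a few lines. The following is a minimal sketch, not the authors' estimation procedure: it only assumes, as the text does, that steppe migrants carried Eastern European and Caucasus hunter-gatherer ancestry in roughly equal amounts. The function name `substratum_chg` and its parameters are illustrative.

```python
def substratum_chg(chg_total, ehg_total):
    """Back out the CHG share of the pre-steppe (substratum) population.

    Assumption (from the text): steppe migrants contributed EHG and CHG
    ancestry in approximately matching amounts. Inputs are percentages
    of total ancestry in the admixed (Mycenaean) population.
    """
    steppe_share = 2 * ehg_total            # EHG plus a matching amount of CHG
    chg_remainder = chg_total - ehg_total   # CHG not attributable to the steppe
    local_share = 100 - steppe_share        # ancestry contributed by the substratum
    chg_in_substratum = 100 * chg_remainder / local_share
    return chg_in_substratum, steppe_share


# Point estimates quoted in the text for Mycenaean Greeks.
chg_sub, steppe = substratum_chg(chg_total=21.2, ehg_total=4.3)
print(f"steppe-derived share: {steppe:.1f}%")
print(f"CHG in substratum:    {chg_sub:.1f}%")
```

Under the same assumption, the steppe-derived share of about 8.6% corresponds to a steppe-to-local ratio of roughly 1:10. The sketch uses point estimates only and ignores the quoted standard errors.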

The fact that Mycenaeans can be modeled 
as a mixture in an ~1:10 ratio of a Yamnaya- 
like steppe-derived population and a Minoan- 
or Early Bronze Age-like Aegean population 
suggests that any contribution of geographi- 
cally intermediate populations (between the 
steppe and the Aegean) to the formation of 
Mycenaeans was minor. This conclusion is 
further supported by the following: (i) the 
lower (~5%) Caucasus hunter-gatherer ances- 
try in the Neolithic of the Balkans compared 
with the ~20% inferred for the Aegean substratum
(1), (ii) the near absence of Balkan
hunter-gatherer (fig. S1) ancestry in the Aegean 
in contrast to other Southeastern European 
populations (~10%) (7), and (iii) the presence 
of Yamnaya-like individuals with minimal 
local ancestry—immediately to the north of 
the Aegean—in Albania and Bulgaria during 
the Early Bronze Age (1). Whatever the genetic 
makeup of people mediating the spread of 
steppe ancestry into the ancestors of Myce- 
naeans, the genetic impact of steppe on Aegean 
populations was quantitatively minor. We es- 
timate the Yamnaya-related steppe ancestry 
proportion in Mycenaeans to be ~% of the 
level of that in the Balkans to the north, ~% 
of that in Armenia in the east, and ~% to % of 
that of populations of Central/Northern Europe 
associated with the Bell Beaker and Corded 
Ware cultures (1).

Eastern European hunter-gatherer ancestry
as a marker for Yamnaya steppe pastoralist
ancestry is absent in a newly reported Middle
Minoan period individual from Zakros on the 
eastern edge of Crete. This individual’s ances- 
try is generally similar to those previously 
published (13), but with significant Levantine 
admixture (30.5 ± 9.1%), which is consistent
with her either being a migrant to the island 
from the east or part of a structured Cretan 
population whose past ethnic diversity was 
noted as early as the Odyssey of Homer (Hom. 
Od. 19.172-177). 

We show that Eastern European hunter- 
gatherer ancestry was also absent in some 
Mycenaean individuals, which suggests that 
although the contrast between the mainland 
and Crete was significant (fig. S1), the penetra- 
tion of Eastern European hunter-gatherer an- 
cestry did not reach the totality of the mainland 
population during the Late Bronze Age and was 
even significantly variable within Mycenaean 
sites. The Griffin Warrior (8), the earliest indi- 
vidual (~1450 BCE) from the Palace of Nestor 
in Pylos, is genetically right in the middle of the 
general population of the Aegean and was thus 
plausibly of entirely local Aegean origin. He 
had no detectable Eastern European hunter- 
gatherer ancestry (compared with the average 
of 4.8 ± 1.1% for the rest of the Mycenaean-era
individuals sampled at the Palace; Fig. 1H). 
This finding could be consistent with a Cretan 
origin of this individual or his ancestors; alter- 
natively, he could be drawn from a mainland 
population that had not experienced Eastern 
European hunter-gatherer admixture, as could 
two later individuals from Pylos—one buried 
near the Palace in a chamber tomb and another 
in a cist grave. Variation in Eastern European 
hunter-gatherer ancestry is observed at short 
geographical distance scales and within the 
same time periods: We observe that four in- 
dividuals (~1450 BCE) of the sample from 
Attica buried at Kolikrepi-Spata had only 
2 ± 1% Eastern European hunter-gatherer ancestry
that was significantly less (by more than
two standard errors) than that of individuals 
from the neighboring island of Salamis and all 
sampling locations in the Peloponnese. This 
suggests that the classical Athenian claim (e.g., 
Plat. Menex. 237b) of having received fewer 
migrants than other Greek poleis in the re- 
mote past may have had an element of truth, 
although larger sample sizes will be neces- 
sary to definitively establish such geographic 
patterns. 

Fig. 2. The Kingdom of Urartu and its neighbors. (A to D) Comparisons of ancestry in four ancestral components [SRB_Iron_Gates_HG, the fifth component of
the model of (1) is negligible]. This analysis shows a stark contrast between Armenia and the other populations in terms of Eastern European hunter-gatherer ancestry
(B) and between Van and Assyrian Mesopotamia (represented by the site of Nemrik 9 in Iraq) in terms of Levantine ancestry (C). When unlabeled individuals are ordered
in increasing Eastern European hunter-gatherer ancestry (E), Assyrian Mesopotamia and Van lack this ancestry (except for an outlier individual from Van), whereas
individuals from Armenia mostly have it, and those from Hasanlu have a limited range from zero Eastern European hunter-gatherer ancestry to a maximum level that is
less than that seen in Armenia.

Fig. 3. The Roman Empire, east and west. (A) The Imperial period Romans
from the vicinity of the city of Rome in Central Italy resembled Roman-Byzantine
Anatolians in their average admixture proportions [95% confidence interval
(CI) of ±1.96 standard errors shown as boxes, and a heteroskedastic Gaussian
process is fitted to unlabeled Italian and Anatolian individuals; dashed lines
indicate 5% and 95% quantiles]. (B) P values of the Baringhaus-Franz
multivariate two-sample test (45) for pairs of populations indicate that Imperial
Romans can be drawn from the same distribution as Roman-Byzantine ones
(P = 0.19) but are significantly different (P < 2.16 x 10°) from all other periods
of Italy. (C) Hierarchical clustering of raw ancestry estimates of diverse
individuals shows overlapping distributions of Imperial Roman and Anatolian
Roman-Byzantine individuals (black) without knowledge of their ancestry
labels and differentiated from the distributions of Southeastern Europe,
Armenia, and the Levant.

Northern migrants made an impact throughout
mainland Greece, even if it was a modest
one. This is also attested in the male line, for
example, by a Y chromosome match of the
rare R-PF7562 haplogroup between a pair of
patrilineal relatives from the Palace of Nestor,
which links Late Bronze Age Mycenaean Greece
with an Early Bronze Age individual of the
North Caucasus at Lysogorskaya that is
genetically similar to the Yamnaya (14). This
patrilineal connection to the Yamnaya should
not be interpreted as a general association of
steppe ancestry with elite burial status, as
the common people, making up most of our
Mycenaean-era individuals, also had steppe
ancestry, whereas some members of the elite
(such as the Griffin Warrior) did not have significant
evidence of it. A parallel example of an
elite individual with less steppe ancestry than
others from the same cultural context during a 
period of steppe ancestry spread is given by the 
“Amesbury Archer,” the most well-furnished 
grave in the Stonehenge mortuary landscape 
of Great Britain (15). These two examples 
highlight the pitfalls of conflating genetic 
ancestry with narratives of social dominance. 
Whatever the social role of early steppe mi- 
grants into the Aegean, they did not establish 
a system that precluded admixture with locals 
or prevented them from rising to positions 
of power. This inclusiveness may explain the 
substantial dilution of steppe ancestry in the 
Aegean, as migrants and locals blended to 
form the ancestors of the Mycenaean-era popu- 
lation, and may shed light on the genesis of the 
Greek language linked, on one hand, with the rest 
of Indo-Europeans through steppe ancestry (7) 
and, on the other, with the people of the Aegean 
who preceded the Proto-Greek speakers (16).

One of the two patrilineal relatives at Pylos 
(I13518) was almost certainly the offspring
of first cousins; we document such close-kin 
unions not only in elite Mycenaean society but 
also in different localities of the Bronze Age 
Southern Arc (fig. S2) (17), including an individual
from Bezdanjača in Croatia (I18717)
who was likely the offspring of an uncle-niece 
pairing. This documents the later persistence 
of the practice of close-kin matings that had 
started with the Neolithic (18, 19), although
whether this is the result of the burials we 
analyzed being a biased subset of a population 
or reflects society-wide cultural preferences 
cannot be resolved with our available sample. 
Did descriptions of such unions in classical 
mythological accounts of the “Heroic Age” 
reflect practices that persisted to the authors’ 
own time? Ancient DNA studies from more 
locations would allow these patterns of mating 
preferences inferred from a handful of sites to 
be characterized at higher resolution. 


The era of Greek colonization 


We report a preliminary look at demographic 
patterns associated with the Greek colonial 
period (eighth to sixth centuries BCE) by 
identifying individuals from both the South- 
ern Arc and outside of it who were geneti- 
cally similar to Bronze Age individuals of the 
Mycenaean period (supplementary text S1 and 
fig. S3) (17). This identifies an Archaic period 
individual from Kastrouli near Delphi in Phokis 
on the Greek mainland and individuals at 
Empúries in Northeastern Spain who are genetically
very similar to Mycenaean-era individuals
from the Greek mainland (20). Empúries
was an outpost colonized by Phocaeans from 
Western Anatolia, who were themselves said 
to be colonists from Phokis (Paus. 7.3.10). Thus, 
we capture the end points of a long chain of 
transmission, with little admixture, across 
the Mediterranean. Could the ancestry of the 
Empúries individuals be traced back to the
beginning of this chain, or was it drawn from 
another genetically similar source? Although 
we do not yet have rich sampling of the peoples 
of the Greek colonial world, systematic sam- 
pling of diverse Greek colonies spread over the 
Mediterranean and Black Sea coasts would 
make it possible to systematically test for evi- 
dence of specific metropolis-colony connections 
and document the extent to which migration, 
admixture with local populations, and genetic 
heterogeneity played a role in Greek colonization. 

Ancestry typical of the Mycenaean period 
also spread to the Eastern Mediterranean, as 
in the case of an individual from Ashkelon as- 
sociated with a Philistine archaeological con- 
text (21). We also show the similarity of some 
individuals from inland Thrace (at Kapitan 
Andreevo) with the Mycenaean genetic profile, 
which suggests that Mycenaeans were genet- 
ically similar to some Thracians from the East 
Balkans, outside the sphere of the Late Bronze 
Age Aegean. This provides a cautionary tale 
highlighting the dangers of conflating genetic 
and cultural similarity. 

The coastal regions of Anatolia formed an- 
other area of Greek settlement, and much of 
the Anatolian peninsula was incorporated into 
the Hellenistic kingdoms established by the 
successors of Alexander the Great, provid- 
ing opportunity for population transfer from 
Southeastern Europe to Anatolia. Yet, we do
not find Mycenaean-like individuals at
first millennium BCE Greek colony sites such
as Halicarnassus (modern Bodrum) or Amisos
(modern Samsun), in the Aegean and Black
Sea regions, respectively. This pattern is qualitatively
different from that at Empúries in
Iberia and is consistent with the account of 
Herodotus that early Greek colonists of Anatolia 
married indigenous Carian women of Anatolia 
when they first settled there (Hdt. 1.146). It is 
also reminiscent of the marriages of Alexander 
himself and his companions with local women 
of the conquered Persian Empire (Arr. An. 
7.4.4ff). It appears that Greeks segregated themselves
socially and reproductively from non-
Greeks in some parts of the Greek world and 
not in others; an important topic for future 
research is to identify the factors that cor- 
related with Greeks mixing with peoples from 
local communities. 


The Urartian Kingdom and its neighbors in 
Iran and Mesopotamia


We have already seen how the Aegean was 
an area of limited Eastern European hunter- 
gatherer penetrance that nonetheless differ- 
entiates it from neighboring Anatolia, where 
Eastern European hunter-gatherer ancestry 
was negligible (7). An even more notable case 
is that of the Iron Age Kingdom of Urartu, 
situated in the mountainous and geograph- 
ically fragmented regions of eastern Turkey 


and Armenia, where the linguistic landscape 
must have been complex in the Bronze and 
Iron Ages. The people at the center of this 
kingdom in the Lake Van region of Turkey 
(Çavuştepe) and its northern extension in
Armenia were strongly connected by material 
culture and were buried only ~200 km apart, 
yet they formed distinct genetic clusters with 
little overlap during the kingdom’s early (ninth 
to eighth centuries BCE) period (Fig. 2). The 
Van cluster is in continuity with the pre- 
Urartian population (~1300 BCE) at neighbor- 
ing Muradiye also in the Van region and is 
characterized by more Levantine ancestry and 
the absence of steppe ancestry. It contrasts 
with the cluster of Urartian period individ- 
uals from Armenia, who have less Levantine 
and some steppe ancestry, like the pre-Urartian 
individuals of the Early Iron Age (1). Our genetic
results help to explain the formation of
linguistic relationships in the region. Popula- 
tion continuity of the Lake Van core popula- 
tion with greater Levantine ancestry may well 
correspond to the Hurro-Urartian language 
family (22) that linked the non-Indo-European 
Urartian language of the kingdom with the 
earlier Bronze Age Hurrian language, whose 
more-southern distribution encompassed parts 
of Syria and Northern Mesopotamia. Into the 
periphery of this Hurro-Urartian linguistic 
sphere came a steppe-admixed population 
from the north, whose presence marks the 
southern edge of steppe expansion that we 
discussed above and whose proximity to the 
Urartian speakers would provide a mechanism 
for the incorporation of Urartian words into 
the Armenian lexicon. 

When we compare (Fig. 2E) the Urartian 
individuals with their neighbors at Iron Age 
Hasanlu in Northwestern Iran (~1000 BCE), 
we observe that the Hasanlu population had 
some Eastern European hunter-gatherer an- 
cestry but to a lesser degree than their con- 
temporaries in Armenia. The population was 
also linked to Armenia by the presence of the 
same R-M12149 Y chromosomes (within haplogroup
R1b), linking it to the Yamnaya population
of the Bronze Age steppe (1). Which
language was spoken in this case is not clear,
but the population shows no connection with
the high-Eastern European hunter-gatherer
R-Z93 (within haplogroup R1a) haplogroup-bearing
groups from Central and South Asia
belonging to steppe populations ancestral
to Indo-Aryan speakers (23), the closest linguistic
relatives of Iranian speakers (24).
Present-day Iranians do have R-Z93 Y chromosomes
(25) or the more general upstream
R1a-M17 ones [observed in every one of 19
diverse populations from Iran (26), as well
as in present-day Indians (27), and modern
Iranians almost completely lack R1b Y chromosomes
(<1% frequency)]. Thus, it appears that
R1a haplogroup Y chromosomes represent a
common link between ancient and modern
Indo-Iranians, whereas R1b haplogroup Y
chromosomes (to which many of the Hasanlu
males belonged) do not. The absence of any
R1a examples among 16 males at Hasanlu,
who are instead patrilineally related to individuals
from Armenia, suggests that a non-Indo-Iranian
(either related to Armenian or
belonging to the non-Indo-European local
population) language may have been spoken
there and that Iranian languages may have
been introduced to the Iranian plateau from
Central Asia only in the first millennium BCE.
Finally, a single individual from the Late
Bronze Age of Assyrian Northern Mesopotamia
(~1250 BCE) resembled the Urartian Van individuals
in lacking Eastern European hunter-gatherer
ancestry, had the highest amount
of Levantine autosomal ancestry (42.8 ± 5.3%),
and had a J-P58-derived Y chromosome with
strong Levantine geographical associations
(1) and may have plausibly been a speaker of a
Semitic language, such as those that have
been spoken and recorded in the region for
most of its history. Archaeology and historical
texts have furnished a wealth of information
about the political geography of the ancient Near
East, and future genetic studies will elucidate
changes of population that occurred either due
to voluntary migration or forced movements of
peoples implemented by state policies.

Fig. 4. Central Asian Turkic admixture in Anatolia. (A) Individuals from
Çapalıbağ (1300 to 1650 CE) and present-day Turkish individuals are intermediate
between Byzantine Anatolia and 500 to 1500 CE Central Asians along a global
principal components analysis distinguishing West from East Eurasians (left-to-right
on the horizontal dimension; noise added on the vertical dimension to
distinguish points). (B) Two-way unsupervised ADMIXTURE analysis of eastern
ancestry: Byzantine (0%); present-day Turkish (9%); Çapalıbağ (18%); and
Central Asian individuals, who differ between 100% (in Mongolia) and 43%
(some ancient populations of Kazakhstan and Kyrgyzstan). (C) Individuals from
Çapalıbağ in Turkey admixed 12.2 ± 1.4 generations (342 ± 39 years) before their
time using Byzantine Anatolians and Central Asians (from 500 to 1500 CE) as
sources. cM, centimorgan. (D) Present-day Turkish people genotyped on the
Human Origins array (34) admixed 30.6 ± 1.9 generations ago (857 ± 53 years ago)
using the same sources as in (C).
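
The admixture dates in the Fig. 4 caption are given both in generations and in years. A minimal sketch of that conversion follows. The generation length of 28 years is an assumption inferred from the quoted figures (12.2 generations corresponds to 342 years), not a value stated in this section, and the function name is illustrative.

```python
YEARS_PER_GENERATION = 28  # assumption: inferred from the caption's paired figures


def generations_to_years(generations, std_err, years_per_gen=YEARS_PER_GENERATION):
    """Scale an admixture date and its standard error from generations to years."""
    return generations * years_per_gen, std_err * years_per_gen


# Estimates quoted in the Fig. 4 caption, panels (C) and (D).
estimates = [
    ("medieval Anatolian individuals (Fig. 4C)", 12.2, 1.4),
    ("present-day Turkish people (Fig. 4D)", 30.6, 1.9),
]

for label, gens, se in estimates:
    years, years_se = generations_to_years(gens, se)
    print(f"{label}: {years:.0f} ± {years_se:.0f} years before sampling")
```

Because the conversion is a simple rescaling, the standard error scales by the same factor. Any uncertainty in the generation length itself is not propagated in this sketch.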


The Anatolian origins of the population of the 
Roman-Byzantine Empire 


A paleogenomic time transect of the city of 
Rome in Central Italy (28) identified an ancestry 
shift toward the Near East during the Imperial
period (27 BCE to 300 CE) but was unable to 
localize the origin of the migrants driving this 
phenomenon. We sought to identify the geo- 
graphic sources of these Imperial-era Romans 
by coanalyzing the data from Italy with data 
from the Southern Arc. Unexpectedly, the
ancestry of the sampled individuals who lived around
Rome in the Imperial period was almost identical
to that of Roman and Byzantine individuals
from Anatolia in both their mean (Fig. 3A)
and pattern of variation (Fig. 3B), whereas 
Italians before the Imperial period had a very 
different distribution (28, 29). We clustered 
diverse Roman, Byzantine, and medieval indi- 
viduals and their immediate predecessors with- 
out any knowledge of their population labels 
and found that the Italian and Anatolian in- 
dividuals clustered together with those of pre- 
Roman Anatolia, whereas pre-Imperial people 
around the city of Rome were systematically 
different (Fig. 3C). This suggests that the 
Roman Empire in both its shorter-lived 
western part and the longer-lasting eastern 
part centered on Anatolia had a diverse but similar 
population plausibly drawn, to a substantial 
extent, from Anatolian pre-Imperial sources. 
In an irony of history, although the Roman 
Republic prevailed in its existential military 
struggle against the Anatolians rallied by 
Mithridates VI of Pontus during the first cen- 
tury BCE, the final incorporation of Anatolia 
into the Roman Empire and the increased con- 
nectivity that ensued may have set the stage 
for the very same Anatolians to become the 
demographic engine of Imperial Rome itself. 
This recreated, in historical time, the mythical 
journey of Aeneas and his Trojan exiles from 
Anatolia to the shores of Italy. 

The Southern Arc was also a recipient of 
many immigrants from outside the region in 
the Historical period, such as two individuals 
sampled in Samsun in the Black Sea region 
from the Roman era in the second to third 
centuries CE (17). These individuals have both 
Eastern European hunter-gatherer and some 
East Eurasian ancestry that contrasts them 
with the local population of the Black Sea re- 
gion that had been stable since the Chalcolithic 
(30), across the Early Bronze Age transi- 
tion at Amasya, and down to the time of 
the Kingdom of Pontus (first century BCE). 
Broad genetic stability in Anatolia during 
the Roman-Byzantine period did not mean 
isolation, as outliers of likely Levantine, North- 
ern European or Germanic, and Iberian origin 
are detected in the Marmara region (in the 
Basilica of Nicaea or present-day Iznik and 
the Virgin Mary Monastery at Zeytinliada, 
Erdek) close to the Imperial capital of Con- 
stantinople (present-day Istanbul), which may 
have attracted a more diverse set of foreigners. 
Other outliers are found at the periphery of 
the Southern Arc in the Iron Age, in Moldova 
and Romania, long after the early steppe 
migrants previously discussed. These are dis- 
tinctive because of the East Eurasian admix- 
ture of Central Asian Scythian individuals 
(31-33). 


Fig. 5. Byzantine and medieval 
Southeastern Europe. We sorted 
admixture proportions of Anatolian 
Neolithic ancestry to investigate 

the dilution of this ancestry in 
present-day populations from South- 
eastern Europe. Roman-, medieval-, 
and Byzantine-era individuals are 

all indicated in bold. During the Bronze 
Age, the range of this ancestry was 
immense, as observed in (1), but 
present-day people from the Balkans 
have less of this ancestry than was 
the case in individuals from the 
Bronze Age through the Iron Age 
and down to classical antiquity 
(Ancient). Medieval and Byzantine 
people from the Balkans were diverse, 
with some (right) continuing the 
ancient pattern of high Anatolian 
Neolithic ancestry, several (middle) 
overlapping with the range of 
present-day people, and some (left) 
having as little of such ancestry as 
present-day Balto-Slavic people 
from Eastern Europe. 


Medieval migrations into Anatolia 
and the Balkans 

East Eurasian ancestry also helps identify a 
noteworthy set of outliers at Capalibag in the 
Aegean coast of Turkey dating from the 14th 
to 17th centuries (Fig. 4) (17). These have ~18% 
such ancestry, unlike Byzantine-era individu- 
als from Turkey (Fig. 4B), which suggests a 
Central Asian influence. An admixture date 
estimate of 12.2 ± 1.4 generations before their 
time using Roman-Byzantine and Central 
Asian sources (Fig. 4C) suggests that the ad- 
mixture occurred in the period surrounding 
the 11th century arrival and expansion of 
Seljuq Turks to Anatolia. Present-day Turkish 
individuals have an admixture date estimate 
of 30.6 ± 1.9 generations (Fig. 4D) and thus 
from the same early centuries of the 1000s CE, 
which coincided with the transfer of control of 
Anatolia from the Romans to the Seljuqs and 
eventually the Ottomans. The genetic con- 
tribution of Central Asian Turkic speakers to 
present-day people can be provisionally esti- 
mated by comparison of Central Asian ancestry 
in present-day Turkish people (~9%) and sam- 
pled ancient Central Asians (range of ~41 to 
100%) to be between 9/100 and 9/41, or ~9 to 22%. 
Our sample of present-day Turkish people is 
broadly representative of the general popula- 
tion, as it derives from eight localities across the 
country (n = 58) (34). The genetic data point to 
Turkish people carrying the legacy of both an- 
cient people who lived in Anatolia for thousands 
of years covered by our study and people coming 
from Central Asia bearing Turkic languages. 
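
The arithmetic behind these dates and the ~9 to 22% range can be checked with a short sketch. It assumes a generation time of 28 years, which is inferred here from the ratio of years to generations quoted in the Fig. 4 legend (342/12.2 ≈ 28) and is not stated in this section; the function names are illustrative only.

```python
# Assumed generation time in years; inferred from the Fig. 4 legend
# (342 years / 12.2 generations), not stated explicitly in the text.
GENERATION_YEARS = 28.0


def generations_to_years(generations, std_error):
    """Convert an admixture date and its standard error from generations to years."""
    return generations * GENERATION_YEARS, std_error * GENERATION_YEARS


def contribution_range(target_fraction, source_min, source_max):
    """Bound the migrant contribution when the sources carry between
    source_min and source_max of the tracer (Central Asian) ancestry."""
    return target_fraction / source_max, target_fraction / source_min


capalibag_years = generations_to_years(12.2, 1.4)
turkish_years = generations_to_years(30.6, 1.9)
low, high = contribution_range(0.09, 0.41, 1.00)

print(f"Capalibag: {capalibag_years[0]:.0f} ± {capalibag_years[1]:.0f} years before their time")
print(f"Present-day: {turkish_years[0]:.0f} ± {turkish_years[1]:.0f} years ago")
print(f"Estimated contribution: {low:.0%} to {high:.0%}")
```

The lower bound corresponds to sources with 100% Central Asian ancestry and the upper bound to sources with ~41%.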
Fig. 6. Pigmentation in West Eurasia. (A to C) We show the temporal distribution 
of genetically predicted eye (A), skin (B), and hair (C) color in West Eurasians 
of the last 16,000 years; each point represents an individual, with the top row for 
each subphenotype corresponding to the Southern Arc and the bottom row 
corresponding to Northern, Central, and Western Europeans and people of the 
Eurasian steppe. ky BP, thousand years before the present. (D) Composite 
phenotypes of all three aspects of pigmentation using the same color scheme as 
(A) to (C) and denoted as eye color (circle), hair color (top), and skin color 
(bottom) in the composite phenotype symbols. The modal phenotype of West 
Eurasians had brown eyes, intermediate skin pigmentation, and brown hair, 
with the highest prevalence (Fisher's exact test) of low pigmentation outside 
the Southern Arc (in the rest of Europe and the Eurasian steppe). 


The medieval period was marked by Slavic 
migrations into the Balkans, which have been 
inferred from the genetic analysis of present-day 
populations (35, 36) and are also recorded in historical 
sources, such as those of Procopius (37) in 
the sixth century CE, when Slavic groups 
came into contact with the Roman Empire 
(38). The South Slavs of today in the Balkans 
are one of the major groups of Slavic speakers, 
and the question of which migrations played 


a role in their origin is of interest for under- 
standing how this group of languages, little- 
attested until medieval times, came to be so 
widespread across the greater part of Eastern 
Europe. We highlight Roman, Byzantine, and 
medieval individuals from Albania, Bulgaria, 
Croatia, Greece, North Macedonia, and Serbia, 
which we studied in conjunction with those 
that preceded them in the Balkans and with 
published data from present-day people geno- 
typed on the Human Origins array (34, 39) 
(Fig. 5). The reduction of Anatolian Neolithic 
ancestry was a long-term process in Southeast- 
ern Europe (2), which allows us to differentiate 
present-day populations from those preceding 
the Slavic migrations. When we order individ- 
uals along this component of ancestry (Fig. 5), 
we observe that present-day Slavs outside the 
Balkans have the least, whereas pre-Slavic in- 
habitants from the Balkans have the most of 
this type of ancestry, with present-day people 
from Southeastern Europe intermediate be- 
tween the two extremes. Three individuals 
from Bulgaria (Samovodene), North Macedonia 
(Bitola), and an outlier individual from Trogir 
in Croatia (700 to 1100 CE) have the lowest 
levels of this ancestry. Most individuals from 
Trogir (a port city of the Adriatic in Croatia 
that was founded by Ancient Greek colonists 
and was part of the Byzantine Empire) over- 
lapped with present-day people from ~700 to 
900 CE, as did 12th century CE individuals 
from Veliko Tarnovo and Ryahovets in Bulgaria 
and a mid-fourth century CE Roman-era 
individual from Marathon in Greece, who, 
however, lacked the Balkan hunter-gatherer 
ancestry found consistently in the present-day 
population (Fig. 1). Finally, three medieval 
individuals from Albania (500 to 1100 CE) 
and a Late Antique (~500 CE) individual from 
Boyanovo in Bulgaria preceding the Slavic mi- 
grations, overlapped with the more ancient 
population, having high levels of Anatolian 
Neolithic ancestry. Among present-day people, 
Greeks and Albanians have more Anatolian 
Neolithic ancestry than their South Slavic 
neighbors. Slavic migrations have some echoes, 
~3000 years later, to the spread of the de- 
scendants of Yamnaya steppe pastoralists 
into Southeastern Europe (1, 7). Although both 
events were transformative, any analogy should 
not be pushed too far. The medieval movements 
were carried out by large, organized communi- 
ties engaging with complex states, such as the 
Avar Khaganate and Byzantine Empire, and no 
comparable polities existed in Yamnaya times. 
Collectively, our data suggest that although 
Balkan groups experienced a shift of ancestry 
in the medieval period, the fusion of locals and 
migrants was variable with individuals of di- 
verse ancestry being present in medieval times 
and persisting up to the present. 


Phenotypes of the Southern Arc in their West 
Eurasian context 


Our survey of populations of the Southern Arc 
focuses on ancestry, but it also illuminates 
other aspects of biology. Superficial pheno- 
types, such as pigmentation, were remarked 
upon by ancient writers. We carried out a 
survey of predicted pigmentation and other 
phenotypes of West Eurasian populations 
across time (supplementary text S3 and Fig. 6) 
(17) to discover the extent to which ancient 
authors’ perceptions (based on direct observa- 
tion or through accounts of faraway peoples) 
might correspond to the genetic inference 
of their appearances (40). We find that the 
modal phenotype of eye, skin, and hair pig- 
mentation in ancient West Eurasians was 
brown-eyed, of intermediate complexion, and 
brown-haired—even among Yamnaya steppe 
pastoralists—contradicting stereotypical char- 
acterizations of Steppe peoples as being blue- 
eyed, pale-skinned, and light-haired (41, 42). 
Note that when we use categorizations—such 
as intermediate—of the continuous skin tone 
phenotype, we refer to the scheme adopted by 
HIrisPlex-S (40); in that scheme, intermediate 
skin tones are commonly found in present-day 
Mediterranean populations and pale ones in 
present-day Northern European ones. A gen- 
eral depigmentation trend can be seen across 
time (Fig. 6), with a reduction of black hair 
and darker skin tones accompanying the in- 
crease of brown hair and intermediate skin 
tones. However, inhabitants of the Southern 
Arc had significantly darker pigmentation on 
average than those of the north (defined as 
Europe outside the Southern Arc and the 
Eurasian steppe) over all periods (Fig. 6), which 
provides support for the identification by 
ancient writers of light-pigmentation pheno- 
types as being more common in some groups 
of the north, such as the Celts and Scythians. 
Another contrast made by ancient writers was 
with people of Africa, such as Egyptians and 
Ethiopians, who were said to be of darker 
pigmentation (e.g., Hdt. 2.104); a compari- 
son of people of the Southern Arc with their 
southern neighbors will become possible when 
genomic data from people living south of the 
Mediterranean become available. When exam- 
ining composite pigmentation phenotypes 
(Fig. 6D), we observe that although average 
pigmentation did differentiate between pop- 
ulations of the Southern Arc and the north, 
light phenotypes were found in both areas at 
similar early dates, growing in parallel in the 
more recent millennia of history. Light pig- 
mentation in West Eurasia was the result of 
selection across time, which continued into 
the Historical period (43, 44), and not of the 
survival of supposed ancient Indo-European 
phenotypes as some 19th and 20th century 
writers supposed (41, 42) or the product of the 
direct influence of climate that some Greco- 
Roman writers hypothesized to explain pat- 
terns they observed during their own time 
(17). The malleability of human phenotypes 
across time and the presence of diverse ones— 
whether dark, light, or intermediate—across 
space undermine prejudiced views of history 


that overemphasize superficial traits at the 
expense of the more meaningful aspects of 
human culture and biology. 

This study illustrates the potential of ar- 
chaeogenetic studies of people of the civiliza- 
tions of the ancient world in conjunction with 
archaeological and textual evidence. Ancient 
writings are replete with the descriptions of 
little-known groups, such as the numerous 
tribes encountered by Xenophon the Athenian 
at the end of the fifth century BCE and re- 
corded in his Anabasis, as he and his fellow 
mercenaries escaped from Mesopotamia north- 
ward to the Black Sea. To what extent did these 
and other named entities of antiquity corre- 
spond to ancestral groups that may one day be 
placed on the genetic landscape of the ancient 
world? Ancient DNA is bringing some of the 
stories of these forgotten peoples back to life 
and paying homage to their legacies. 
Jonathan E. Pekar, Stephen A. Goldstein, Angela L. Rasmussen, Moritz U. G. Kraemer, 
Chris Newman, Marion P. G. Koopmans, Marc A. Suchard, Joel O. Wertheim, 
Philippe Lemey, David L. Robertson, Robert F. Garry, Edward C. Holmes, 
Andrew Rambaut, Kristian G. Andersen 


Understanding how severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in 2019 is 
critical to preventing future zoonotic outbreaks before they become the next pandemic. The Huanan 
Seafood Wholesale Market in Wuhan, China, was identified as a likely source of cases in early 

reports, but later this conclusion became controversial. We show here that the earliest known 
COVID-19 cases from December 2019, including those without reported direct links, were geographically 
centered on this market. We report that live SARS-CoV-2-susceptible mammals were sold at the 
market in late 2019 and that within the market, SARS-CoV-2-positive environmental samples were 
spatially associated with vendors selling live mammals. Although there is insufficient evidence to 
define upstream events, and exact circumstances remain obscure, our analyses indicate that the 
emergence of SARS-CoV-2 occurred through the live wildlife trade in China and show that the Huanan 


market was the epicenter of the COVID-19 pandemic. 


On 31 December 2019, the Chinese government 
notified the World Health Organization 
(WHO) of an outbreak of 
severe pneumonia of unknown etiology 
in Wuhan, Hubei Province, China (1-4), 
a city of ~11 million people. Of the initial 41 
people hospitalized with unknown pneumo- 
nia by 2 January 2020, 27 (66%) had direct 
exposure to the Huanan Wholesale Seafood 
Market (hereafter, “Huanan market”) (2, 5, 6). 
These first cases were confirmed to be infected 
with a novel coronavirus, subsequently named 
severe acute respiratory syndrome corona- 
virus 2 (SARS-CoV-2), and were suffering from 
a disease later named coronavirus disease 
2019 (COVID-19). The initial diagnoses of 
COVID-19 were made in several hospitals 
independently between 18 and 29 December 
2019 (5). These early reports were free from 
ascertainment bias because they were based 
on signs and symptoms before the Huanan 
market was identified as a shared risk factor 
(5). A subsequent systematic review of all cases 
reported to China’s National Notifiable Dis- 
ease Reporting System by hospitals in Wuhan 
as part of the joint WHO-Chinese “WHO- 
convened global study of origins of SARS- 
CoV-2: China Part” (hereafter, “WHO mission 
report”) (7) showed that 55 of 168 of the ear- 
liest known COVID-19 cases were associated 
with this market. However, the observation 
that the preponderance of early cases were 
linked to the Huanan market, alone, does not 
establish that the pandemic originated there. 
Sustained live mammal sales during 2019 
occurred at the Huanan market and three 
other markets in Wuhan, and included wild 
and farmed wildlife (8). Several of these species 
are known to be experimentally susceptible 


to SARS-related coronaviruses (SARSr-CoVs) 
such as SARS-CoV (hereafter, “SARS-CoV-1”) 
and SARS-CoV-2 (9-11). During the early stages 
of the COVID-19 pandemic, animals sold at the 
Huanan market were hypothesized to be the 
source of the unexplained pneumonia cases 
(12-19) (data S1), consistent with the emer- 
gence of SARS-CoV-1 from 2002 to 2004 (20), 
as well as other viral zoonoses (21-23). This led 
to the decision to close and sanitize the Huanan 
market on 1 January 2020, with environmental 
samples also being collected from vendors’ 
stalls (7, 12, 24) (data S1). 

Determining the epicenter of the COVID-19 
pandemic at the neighborhood level rather than 
at the city level could help to resolve whether 
SARS-CoV-2 had a zoonotic origin, similar to 
SARS-CoV-1 (20). In this study, we obtained data 
from a range of sources to test the hypothesis 
that the COVID-19 pandemic began at the 
Huanan market. Despite limited testing of live 
wildlife sold at the market, collectively, our re- 
sults provide evidence that the Huanan market 
was the early epicenter of the COVID-19 pandem- 
ic and suggest that SARS-CoV-2 likely emerged 
from the live wildlife trade in China. However, 
events upstream of the market, as well as exact 
circumstances at the market, remain obscure, 
highlighting the need for further studies to under- 
stand and lower the risk of future pandemics. 


Results 
Early cases lived near to and centered on the 
Huanan market 


The 2021 WHO mission report identified 174 
COVID-19 cases in Hubei Province in December 
2019 after careful examination of reported case 
histories (7). Although geographical coordinates 
of the residential locations of the 164 cases who 
lived within Wuhan were unavailable, we were 
able to reliably extract the latitude and longitude 
coordinates of 155 cases from maps in the 
report (figs. S1 to S8). 

Although early COVID-19 cases occurred 
across Wuhan, most clustered in central Wuhan 
near the west bank of the Yangtze River, with a 
high density of cases near to, and surrounding, 
the Huanan market (Fig. 1A). We used a kernel 
density estimate (KDE) to reconstruct an under- 
lying probability density function from which 
the home locations for each case were drawn 
(25). Using all 155 of the December 2019 cases, 
the location of the Huanan market lies within 
the highest density contour that contains 1% 
of the probability mass (Fig. 1B). For a KDE 
estimated using the 120 cases with no known 
linkage to the market, the market remains 
within the highest density 1% contour (Fig. 1C). 
The clustering of COVID-19 cases in December 
around the Huanan market (Fig. 1, B and C, 
insets) contrasts with the pattern of widely 
dispersed cases across Wuhan by early January 
through mid-February 2020 (Fig. 1, D and E), 
which we mapped using location data from 
individuals who had used a COVID-19 assis- 
tance channel on Sina Weibo, a Chinese social 
media platform (26). Weibo-based data analyses 
showed that, unlike early COVID-19 cases, by 
January and February, many of the sick indi- 
viduals who sought help resided in highly pop- 
ulated areas of the city, particularly in areas 
with a high density of older people (Fig. 1E and 
figs. S9 and S10). 
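
To make the notion of a "highest density contour that contains 1% of the probability mass" concrete, the following is a minimal sketch rather than the procedure used in the study (which is described in the materials and methods). It assumes an isotropic Gaussian kernel with a fixed bandwidth, planar coordinates in kilometers, and synthetic case locations; all names and parameter values are illustrative.

```python
import math
import random


def kde_density(x, y, points, bandwidth):
    """Isotropic Gaussian kernel density estimate at (x, y)."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * bandwidth ** 2))
    return norm * total


def hdr_threshold(points, bandwidth, mass, extent, steps=40):
    """Density level above which the grid cells hold `mass` of the total probability."""
    lo, hi = extent
    cell = (hi - lo) / steps
    densities = []
    for i in range(steps):
        for j in range(steps):
            x = lo + (i + 0.5) * cell
            y = lo + (j + 0.5) * cell
            densities.append(kde_density(x, y, points, bandwidth))
    densities.sort(reverse=True)
    target = mass * sum(densities)
    running = 0.0
    for d in densities:
        running += d
        if running >= target:
            return d
    return densities[-1]


def in_highest_density_region(site, points, bandwidth, mass, extent):
    """True if the density at `site` reaches the level of the given contour."""
    threshold = hdr_threshold(points, bandwidth, mass, extent)
    return kde_density(site[0], site[1], points, bandwidth) >= threshold


rng = random.Random(0)
# Synthetic "home locations" on a flat plane, in kilometers (illustration only).
cases = [(rng.gauss(0.0, 3.0), rng.gauss(0.0, 3.0)) for _ in range(155)]
site = (0.0, 0.0)  # hypothetical site of interest
for mass in (0.5, 0.01):
    inside = in_highest_density_region(site, cases, 2.0, mass, (-20.0, 20.0))
    print(f"site inside highest density {mass:.0%} region: {inside}")
```

A smaller probability mass gives a higher density threshold, so a site that lies inside the 1% region sits at or very near the peak of the estimated distribution.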

We also investigated whether the December 
COVID-19 cases were closer to the market than 
expected based on an empirical null distribu- 
tion of Wuhan’s population density [data from 
WorldPop.org (27, 28)], with a median distance 
to the Huanan market of 16.11 km (25). To ac- 
count for older individuals being more likely to 
be hospitalized and sick with COVID-19 (29), 
we age-matched the population data to the 
December 2019 COVID-19 case data. We con- 
sidered three categories of cases, which were 
all significantly closer to the Huanan market 
than expected: (i) all cases (median distance 
4.28 km; P < 0.001), (ii) cases linked directly to 
the Huanan market (median distance 5.74 km; 
P < 0.001), and (iii) cases with no evidence of 


a direct link to the Huanan market (median 
distance 4.00 km; P < 0.001) (Fig. 2A). The 
cases with no known link to the market on 
average resided closer to the market than the 
cases with links to the market (P = 0.029). 
Furthermore, the distances between the center 
points (Fig. 2B) and the Huanan market were 
shorter than expected for all categories of 
December cases compared with the empirical 
null distribution of Wuhan’s population den- 
sity (Fig. 2A). For all December cases, the 
center point was located 1.02 km away (P = 
0.007); for cases with market links, it was 
2.28 km away (P = 0.034); and for the cases 
with no reported link to the market, it was 
0.91 km away (P = 0.006). By comparison, 
the center point of age-matched samples drawn 
from the empirical null distribution was 
4.65 km away from the market (Fig. 2A). 
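
One plausible way to frame the "closer than expected" comparison is a Monte Carlo test of the median distance against locations drawn from a population-based null. The sketch below is an illustration under stated assumptions, not the study's implementation: it omits age matching, uses synthetic locations around hypothetical coordinates, and the function names are invented for this example.

```python
import math
import random
import statistics


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def median_distance(site, locations):
    """Median distance (km) from `site` to a list of (lat, lon) locations."""
    return statistics.median(
        haversine_km(site[0], site[1], lat, lon) for lat, lon in locations
    )


def monte_carlo_p(site, cases, null_pool, draws=1000, seed=0):
    """One-sided P value: share of null samples (same size as `cases`) whose
    median distance to `site` is at most the observed median."""
    rng = random.Random(seed)
    observed = median_distance(site, cases)
    hits = 0
    for _ in range(draws):
        sample = rng.choices(null_pool, k=len(cases))
        if median_distance(site, sample) <= observed:
            hits += 1
    return (hits + 1) / (draws + 1)


rng = random.Random(1)
site = (30.0, 114.0)  # hypothetical coordinates, not the real market location
# Synthetic null: locations spread widely around the site.
null_pool = [
    (site[0] + rng.uniform(-0.2, 0.2), site[1] + rng.uniform(-0.2, 0.2))
    for _ in range(1000)
]
# Synthetic cases: locations concentrated near the site.
cases = [
    (site[0] + rng.uniform(-0.03, 0.03), site[1] + rng.uniform(-0.03, 0.03))
    for _ in range(50)
]
print(median_distance(site, cases), monte_carlo_p(site, cases, null_pool))
```

The same scheme could be applied to the center-point distance by replacing `median_distance` with the distance from the site to the centroid of each sample.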

We tested the robustness of our results for 
the possibility of ascertainment bias (25). For 
all mapped cases (n = 155), under the “center-point 
distance to the Huanan market” test, the 
38 cases residing closest to the market (within 
a radius of 1.6 km) could be removed from the 
dataset before losing significance at the α = 
0.05 level (fig. S12). For the “median distance 
to Huanan market” test, we could remove 98 
cases (63%) (r = 5.8 km). For cases not directly 
linked to the Huanan market (n = 120), we 
could remove 36 (30%) (r = 1.5 km) and 81 
(68%) (r = 4.3 km) cases for the two tests, 
respectively, before losing significance at the 
α = 0.05 level (fig. S12). 

We performed a spatial relative risk analysis 
(25) to compare December 2019 COVID-19 cases 
with January-February 2020 cases reported 
through Weibo (Fig. 2C). The Huanan market 
is located within a well-defined area with high 
case density that would be expected to be ob- 
served in <1 in 100,000 samplings of the Weibo 
data empirical distribution (the relative risk 
analysis is shown in Fig. 2C and the control 
distribution in Fig. 1D). No other regions in 
Wuhan showed a comparable case density. 


Both early lineages of SARS-CoV-2 were 
geographically associated with the market 


Two lineages of SARS-CoV-2 designated A and 
B (30) have co-circulated globally since early in 
the COVID-19 pandemic (31). Until a report in 
a recent preprint (24), only lineage B sequen- 
ces had been sampled at the Huanan market. 
The 11 lineage B cases from December 2019 for 
which we have location information resided 
closer than expected to the Huanan market 
compared with the age-matched Wuhan pop- 
ulation distribution (median distance 8.30 km; 
P = 0.017) (25). The center point of the 11 line- 
age B cases was 1.95 km from the Huanan mar- 
ket, also closer than expected (P = 0.026). The 
two lineage A cases for which we have location 
information involved the two earliest lineage A 
genomes known to date. Neither case reported 
any contact with the Huanan market (7). The 
first case was detected before any knowledge 
of a possible association of the unexplained 
pneumonia in Wuhan with the Huanan mar- 
ket (5), and therefore could not have been a 
product of ascertainment bias in favor of cases 
residing near the market. The second case had 
stayed in a hotel near the market (32) for the 
5 days preceding symptom onset (25). Relative 
to the age-matched Wuhan population distri- 
bution, the first individual resided closer to 
the Huanan market (2.31 km) than expected 
(P = 0.034). Although the exact location of the 
hotel near the market was not reported (32), 
there are at least 20 hotels within 500 m (table 
S1). Under the conservative assumption that 
the hotel could have been located as far as 
2.31 km from the Huanan market (as was the 
residence of the other lineage A case), and 
assuming that this location is comparable to 
a residential location given the timing of the 
stay before symptom onset (25), it would be 
unlikely to observe both of the earliest line- 
age A cases this near to the Huanan market 
(P = 0.001 or less). The finding that both 
identified lineage A cases had a geographical 
connection to the market, in combination with 
the detection of lineage A within the market 
(24), supports the likelihood that during the 
early epidemic, lineage A was, like lineage B, 
disseminating outward from the Huanan market 
into the surrounding neighborhoods. 

Our statistical results were robust to a range 
of factors, for example, the use of an empirical 
control distribution that was based on presumptive 
COVID-19 case locations later in the 
Wuhan epidemic (Weibo data); laboratory-confirmed 
versus clinically diagnosed cases; 
and uncertainty in case location or missing 
data (figs. S13 to S15) (25). For instance, we 
artificially introduced location uncertainty 
(“noise”) in each case location in our dataset 
by randomly resampling each point within a 
circle of radius 1000 m centered on its original 
center point, and the conclusions were unaf- 
fected (fig. S13). The extraction method that 
we used actually introduced only up to ~50 m 
of noise in each case location estimate (fig. S7), 
ruling out the possibility that our overall re- 
sults were affected by this source of error. The 
results were also robust when corrected for 
multiple-hypothesis testing (table S4). 
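
The location-noise check described above can be sketched as follows. This is an illustration under assumptions: points are taken to be in planar coordinates measured in meters, the resampling is taken to be uniform over the area of the circle, and the function names are hypothetical.

```python
import math
import random


def jitter_within_circle(point, radius_m, rng):
    """Draw a point uniformly from the disc of radius `radius_m` centered on `point`."""
    r = radius_m * math.sqrt(rng.random())  # sqrt makes the draw uniform over area
    theta = 2.0 * math.pi * rng.random()
    return point[0] + r * math.cos(theta), point[1] + r * math.sin(theta)


def jitter_all(points, radius_m, seed=0):
    """Apply independent location noise to every point (reproducible via `seed`)."""
    rng = random.Random(seed)
    return [jitter_within_circle(p, radius_m, rng) for p in points]


# Synthetic case locations in meters (illustration only).
original = [(0.0, 0.0), (2500.0, -400.0), (-1200.0, 3100.0)]
noisy = jitter_all(original, 1000.0)
for before, after in zip(original, noisy):
    print(before, "->", after)
```

Each statistical test would then be rerun on the jittered locations, and the conclusion is considered robust if it is unchanged across repeated resamplings.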


Wild animal trading in Wuhan markets 


In addition to selling seafood, poultry, and 
other commodities, the Huanan market was 
among four markets in Wuhan reported to 
consistently sell a variety of live wild-captured 
or farmed mammal species in the years and 
months leading up to the COVID-19 pandemic 
(8). There are, however, no prior reports of 
which species, if any, were sold at the Huanan 
market in the months leading up to the pan- 
demic. Here, we report that multiple plausible 
intermediate wildlife hosts of SARS-CoV-2 
progenitor viruses, including red foxes (Vulpes 
vulpes), hog badgers (Arctonyx albogularis), 
and common raccoon dogs (Nyctereutes pro- 
cyonoides), were sold live at the Huanan mar- 
ket up until at least November 2019 (Table 1 and 
table S5). No reports are known to be available 
for SARS-CoV-2 test results from these mam- 
mals at the Huanan market. Despite a general 
slowdown in live animal sales during the win- 
ter months, we report that raccoon dogs, which 
are sold for both meat and fur, were consist- 
ently available for sale throughout the year, 
including at the Huanan market in November 
2019 (Table 1 and table S5). 

There were potentially many locations in 
Wuhan, a city of 11 million, that would have 
been equally or more likely than the Huanan 
market to sustain the first recognized cluster of 
a new respiratory pathogen had its introduc- 
tion not been linked to a live animal market, 
including other shopping venues, hospitals, 
elder care facilities, workplaces, universities, 
and places of worship. To investigate possi- 
ble sites, we compared the relative extent of 
intra-urban human traffic to the Huanan mar- 
ket versus other locations within the city of 
Wuhan using a location-specific dataset of 
social media check-ins in the Sina Visitor System 
(25, 33). We found at least 70 other markets 
throughout the city of Wuhan that received 
more social media check-ins than the Huanan 
market (Fig. 3). To extend this analysis beyond 
only markets, we also used a subsequently pub- 
lished list of known SARS-CoV-2 superspreader 
locations (34) to identify 430 locations in Wuhan 
Fig. 1. Spatial patterns of COVID-19 cases in Wuhan in December 2019 and January-February 2020. 
(A) Locations of the 155 cases that we extracted from the WHO mission report (7). Inset: map of Wuhan with 
the December 2019 cases indicated with gray dots (no cases are obscured by the inset). In both the inset and the 
main panel, the location of the Huanan market is indicated with a red square. (B) Probability density contours 
reconstructed by a KDE using all 155 COVID-19 case locations from December 2019. The highest density 
50% contour marked is the area for which cases drawn from the probability distribution are as likely to lie inside 
as outside. Also shown are the highest density 25%, 10%, 5%, and 1% contours. Inset: expanded view and 
the highest density 1% probability density contour. (C) Probability density contours reconstructed using the 
120 COVID-19 case locations from December 2019 that were unlinked to the Huanan market. (D) Locations of 
737 COVID-19 cases from Weibo data dating to January-February 2020. (E) The same highest probability 
density contours (50% through 1%) as shown in (B) and (C) for 737 COVID-19 case 
locations from Weibo data. 


Fig. 2. Spatial analyses. (A) Inset: map 
of Wuhan, with gray dots indicating 
the 1000 random samples from the 
WorldPop.org null distribution. In the 
main panel, the median distance between 
Huanan market and the WorldPop.org 
null distribution is indicated by the outer 
black circle. December 2019 cases are 
indicated by concentric red circles 
(distances to Huanan market are 
described in the purple boxes). The 
center point of Wuhan population density 
data is indicated by a blue dot. Center 
points of December 2019 case locations 
are shown as follows: red dots indicate 
“all,” “linked,” and “unlinked” cases, and 
the yellow dot indicates lineage B cases. 
Distances from center points to Huanan 
market are described in orange boxes. 
(B) Schematic showing how cases can be 
near to, but not centered on, a specific 
location. We hypothesized that if the 
Huanan market were the epicenter of the 
pandemic, then early cases should fall not 
just unexpectedly near to it but should 
also be unexpectedly centered on it 
(see the materials and methods). The 
blue dots show how hypothetical cases 
quite near the Huanan market could 
nevertheless not be centered on it. 
(C) Tolerance contours based on relative 
risk of COVID-19 cases in December 2019 
versus data from January-February 2020. 
The gray dots show the December case 
locations. The contours represent the 
probability of observing that density of 
December cases within the bounds of the 
given contour if the December cases 
had been drawn from the same spatial 
distribution as the January-February data. 


that may have been at high risk for super- 
spreader events and which received more check- 
ins than the Huanan market (Fig. 3, inset). The 
Huanan market accounted for 0.12% (120 of 
98,146) of social media check-ins to markets 
in the dataset that received at least as many 
check-ins as the Huanan market. The market 
accounted for 0.04% (120 of 262,233) of all 
social media check-ins to the >400 sites in 
Wuhan identified as especially likely to be 
potential superspreader locations and which 
received at least as many social media visits as 
the Huanan market. Considering the number 
of check-ins to all four markets selling live wild 
animals in Wuhan (combined), they accounted 
for 0.21% (206 of 98,146) of market visits and 
0.079% (206 of 262,233) of visits to the 430 
potential superspreader sites, where a new 


respiratory disease might first be noticed in 
a large city. 

A dataset from the Chinese Center for Disease 
Control and Prevention (CCDC) report dated 
22 January 2020 (data S1) (12, 13, 15, 16) was 
made publicly available in June 2020 (24, 35). 
A total of 585 environmental samples were 
initially taken from various surfaces in the 
Huanan market on 1 and 12 January 2020 by 
the CCDC (tables S6 and S7 and data S1) 
(12, 13, 15, 16, 24, 35), with further samples 
taken throughout the market during January 
and February (24). We extended the analysis 
in the WHO mission report (7) by integrating 
public online maps and photographic evidence, 
data from public business registries (table S8 
and data S2), information about which live 
mammal species were sold at the Huanan 
market in late 2019 (Table 1 and table S5), 
and the CCDC report (data S1). We recon- 
structed the floor plan of the market and 
integrated information from business registries 
of vendors at the market (fig. S16 and table 
S8), as well as an official report (36) recording 
fines to three business owners for illegal sales 
of live mammals (data S2) (36). From this, we 
identified an additional five stalls that were 
likely selling live or freshly butchered mammals 
or other unspecified meat products in the south- 
west corner of the western section of the market 
(Fig. 4A, figs. S16 and S17, and table S6). 

Five of the SARS-CoV-2-positive environ- 
mental samples were taken from a single stall 
selling live mammals in late 2019 (table S6). 
Further, all five objects sampled showed an 
association with animal sales, including a metal 
cage, two carts (of the kind frequently used to 
transport mobile animal cages), and a hair and 
feather remover (table S6). No human COVID- 
19 cases were reported there (7, 12). The same 
stall was visited by one of us (E.C.H.) in 2014, 
and live raccoon dogs were observed housed 
in a metal cage stacked on top of a cage with 
live birds (Fig. 4A) (37). A recent report (24) 
identified that the grates outside of this stall, 
upon which animal cages were stacked (37), 
were positive for SARS-CoV-2. 


Positive environmental samples linked both to 
live mammal sales and to human cases at the 
Huanan market 


We used a spatial relative risk analysis to 
identify potential regions of the market with 
an increased density of positive environmental 
samples (25). We found evidence (P < 0.05) of 
a region in the southwest area of the market 
where live mammals were for sale (Fig. 4B). 
Although environmental sampling of the mar- 
ket was incomplete and spatially heterogeneous 
(data S1 and table S6), our analysis accounts 
for the empirical environmental sampling 
distribution, which was biased toward “stalls 
related to December cases,” as well as “stalls 
that sold livestock, poultry, farmed wildlife” 
(7) (Fig. 4, C and D). The “distance to the
nearest vendor selling live mammals” and the 
“distance to the nearest human case” were 
independently predictive of environmental 
sample positivity (P = 0.004 and 0.014, re- 
spectively, for m = 6; table S9). To further 
investigate the robustness of these findings 
to possible sampling biases, we considered 
three scenarios: (i) oversampling of live mam- 
mal and unknown meat stalls, (ii) overcount- 
ing of positive samples, and (iii) exclusion of 
the seafood stand near the wildlife area of the 
market (with five positive samples) from our 
analysis (table S10). In each case, the distance 
to live mammal vendors remained predictive 
of environmental sample positivity, and the 
region of increased positive sample density in 
the southwest corner of the western section of 
the market remained consistent (fig. S18). 
Finally, to analyze the spatial patterning of 
human cases within the Huanan market, we 
plotted cases as a function of symptom onset 
from the WHO mission report (7) (Fig. 5A and 
table S11) (25). All eight COVID-19 cases detected 
before 20 December 2019 were from the western 
side of the market, where mammal species were 
also sold (Fig. 5, B and C). Unlike SARS-CoV-2- 
positive environmental samples (Fig. 4, A and 
C), we found that COVID-19 cases were more 
diffuse throughout the building (Fig. 5). 


Study limitations 


There are several limitations to our study. 
We have been able to recover location data 


for most of the December-onset COVID-19 
cases identified by the WHO mission (7) with 
sufficient precision to support our conclusions. 
However, we do not have access to the precise 
latitude and longitude coordinates of all of 
these cases. Should such data exist, they may 
be accompanied by additional metadata, some 
of which we have reconstructed, but some of 
which, including the date of onset of each case, 
would be valuable for ongoing studies. We also 
lack direct evidence of an intermediate animal 
infected with a SARS-CoV-2 progenitor virus 
either at the Huanan market or at a location 
connected to its supply chain, such as a farm. 
Additionally, no line list of early COVID-19 
cases is available, and we do not have com- 
plete details of environmental sampling. How- 
ever, compared with many other outbreaks, 
we have more comprehensive information on 
early cases, hospitalizations, and environmental 
sampling (7). 


Discussion 


Several lines of evidence support the hypoth- 
esis that the Huanan market was the epicenter 
of the COVID-19 pandemic and that SARS- 
CoV-2 emerged from activities associated with 
the live wildlife trade there. Spatial analyses 
within the market show that SARS-CoV-2- 
positive environmental samples, including 
cages, carts, and freezers, were associated with 
activities concentrated in the southwest cor- 
ner of the market. This is the same section 
where vendors were selling live mammals, 


including raccoon dogs, hog badgers, and 
red foxes, immediately before the COVID-19 
pandemic. Multiple positive samples were 
taken from one stall known to have sold live 
mammals, and the water drain proximal to 
this stall, as well as other sewerages and a 
nearby wildlife stall on the southwest side of 
the market, tested positive for SARS-CoV-2 
(24). These findings suggest that infected ani- 
mals were present at the Huanan market at the 
beginning of the COVID-19 pandemic; how- 
ever, we do not have access to any live animal 
samples from relevant species. Additional in- 
formation, including sequencing data and de- 
tailed sampling strategy, would be invaluable 
to test this hypothesis comprehensively. 

In a related study, we inferred separate 
introductions of SARS-CoV-2 lineages A and 
B into humans from likely infected animals 
at the Huanan market (38). We estimated 
the first COVID-19 case to have occurred 
in November 2019, with few human cases 
and hospitalizations occurring through mid- 
December (38). A recent preprint (24) confirms 
the authenticity of the CCDC report (data S1) 
and records additional positive environmental 
samples in the southwestern area of the mar- 
ket selling live animals. This report also docu- 
ments the early presence of the A lineage of 
SARS-CoV-2 in a Huanan market environ- 
mental sample. This, along with the lineage 
A cases that we report in close geographical 
proximity to the market in December 2019, 
challenges the suggestion that the market was 


Table 1. Live mammals traded at the Huanan market in November and December 2019.

Columns: Species (susceptibility*); Family (susceptibility*); Order (susceptibility*); Observed at Huanan market November 2019

Raccoon dog (Nyctereutes procyonoides) (Y); Canidae (Y); Carnivora (Y)
Complex-toothed flying squirrel (Trogopterus xanthipes); Sciuridae; Rodentia (Y)
[Other rows are incomplete in the source text. Surviving family and order entries: Mustelidae (Y), Carnivora; Lagomorpha (Y); Carnivora.]

*Based on live susceptibility findings, serological findings, or ACE2-binding assays. See table S5 for details and associated references.
†Animals listed as “N” (no) were, however, present at Wuhan market during the 2017-2019 study period (8).

Fig. 3. Visitors to locations throughout Wuhan. Shown is the number of
social media check-ins in the Sina Visitor System from 2013 to 2014 as shared
by (33). The numbers of check-ins to individual markets throughout the city are
shown in comparison with check-ins at the Huanan market. Inset: the total
number of check-ins to all individual locations across the city of Wuhan grouped
by category. Locations with >50 visitor check-ins are shown, and the locations
that received more check-ins than the Huanan market in the same period
are shown in red.


simply a superspreading event, which would 
be lineage specific. Rather, it adds to the evi- 
dence presented here that lineage A, like 
lineage B, may have originated at the Huanan 
market and then spread from this epicenter 
into the neighborhoods surrounding the market 
and beyond. 

Several observations suggest that the geo- 
graphic association of early COVID-19 cases 
with the Huanan market is unlikely to have 
been the result of ascertainment bias (see the 
supplementary text and tables S2 and S3) (39). 
These include that (i) few, if any, cases among 
Huanan market-unlinked individuals are likely 
to have been detected by active searching in 
the neighborhoods around the market, only in 
hospitals, because all of the cases analyzed here 
were hospitalized (7); (ii) public health officials 
simultaneously became aware of Huanan-linked 
cases both near and far from the Huanan market,
not just the ones near it (fig. S11) (5); (iii)
Huanan market-unlinked cases would not 
be expected to live significantly closer to the 
market than linked cases if they had been 
ascertained as contacts traced from those 
market-linked cases; and (iv) seroprevalence 
in Wuhan was highest in the districts around 
the market (40, 41). It is also noteworthy that 
the December 2019 COVID-19 cases that we 
consider here were identified based on re- 
views of clinical signs and symptoms, not 
epidemiological factors such as where they 
resided or links to the Huanan market (7),
and that excess deaths from pneumonia rose 
first in the districts surrounding the market 
(42). Moreover, the spatial relationship with 
the Huanan market remains after removing 
the two-thirds of the unlinked cases residing 
nearest the market. 

One of the key findings of our study is that 
“unlinked” early COVID-19 patients, i.e., those
who did not work at the market, did not know 
someone who did, and had not recently visited 
the market, resided significantly closer to the 
market than patients with a direct link to it. 
The observation that a substantial proportion 
of early cases had no known epidemiological 
link had previously been used as an argument 
against the Huanan market being the epicen- 
ter of the pandemic. However, this group of 
cases resided significantly closer to the market 
than those who worked there, indicating that 
they had been exposed to the virus at or near 
the Huanan market. For market workers, the 
exposure risk was their place of work, not their 
residential locations, which were significantly 
farther afield than those cases not formally 
linked to the market. 

Our spatial analyses show how patterns of 
COVID-19 cases shifted between late 2019, 
when the outbreak began (43), and early 2020, 
as the epidemic spread widely across Wuhan. 
COVID-19 cases in December 2019 were asso- 
ciated with the Huanan market in a manner 


unrelated to Wuhan population density or 
demographic patterns, unlike the wide spatial 
distribution of cases observed during later 
stages of the epidemic in January-February 
2020. This observation fits with the evidence 
from other sources that SARS-CoV-2 was not 
widespread in Wuhan at the end of 2019. For 
example, no SARS-CoV-2-positive sera or influ- 
enza-like illness reports were recorded among 
more than 40,000 blood donor samples collected
up to December 2019 (44, 45), and none of 
thousands of samples taken from patients with 
influenza-like illness at Wuhan hospitals in 
October to December 2019 tested for SARS- 
CoV-2 RNA was positive (7). 

The sustained presence of a potential source 
of virus transmission into the human popula- 
tion in late 2019, plausibly from infected live 
mammals sold at the Huanan market, offers 
an explanation of our findings and the origins 
of SARS-CoV-2. The pattern of COVID-19 cases 
reported for the Huanan market, with the 
earliest cases in the same part of the market 
as the wildlife sales and evidence of at least 
two introductions (38), resembles the multiple 
cross-species transmissions of SARS-CoV-2 sub- 
sequently observed during the pandemic from 
animals to humans on mink farms (46) and 
from infected hamsters to humans in the pet 
trade (47). There was an extensive network 
of wildlife farms in western Hubei Province, 
including hundreds of thousands of raccoon 
Fig. 4. Map of the Huanan market. (A) Aggregated environmental sampling
and human case data from the Huanan market. Captions describe the types
of SARS-CoV-2-positive environmental samples obtained from known live animal
vendors (left) and from stalls with samples with known virus lineage (center).
Lineage is unknown unless noted; sequencing data have not been released
for some samples, and many samples were PCR-positive but not sequenced.
Image at left shows raccoon dogs in a metal cage on top of caged birds from a
business with five positive environmental samples (photo by E.C.H.). Center:
Rectangle with dashed outline indicates the “wildlife” section of the market.
(B) Relative risk analysis of positive environmental samples. Tolerance
contours enclose regions with statistically significant elevation in density of
positive environmental samples relative to the distribution of sampled stalls.
(C) Distribution of positive environmental samples. Sample locations (centroid of
corresponding business) and quantity are shown as black circles. (D) Control
distribution for relative risk analysis. All businesses investigated with
environmental sampling are shown as black circles (there is one circle per
business regardless of whether a positive sample was found). See table
S12 for details on stalls that were SARS-CoV-2-negative.


dogs on farms in Enshi Prefecture, which 
supplied the Huanan market (48). This region 
of Hubei contains extensive cave complexes 
housing Rhinolophus bats, which carry SARSr- 
CoVs (49). SARS-CoV-1 was recovered from 
farmed masked palm civets (Paguma larvata)
from Hubei in 2003 and 2004 (20). The ani- 
mals on these farms (nearly 1 million) were 
rapidly released, sold, or killed in early 2020 
(48), apparently without testing for SARS-CoV-2 


(7). Live animals sold at the market (Table 1) 
were apparently not sampled either. By con- 
trast, during the SARS-CoV-1 outbreaks, farms 
and markets remained open for more than 
a year after the first human cases occurred, 
Fig. 5. Location and timing of human cases in Huanan market. (A) Outline colors correspond to the
timing of the first known case in each business. Individual case timing is denoted by marker color and shown
within the outlined business. (B) Distribution of known cases on or before 20 December 2019. Case locations
are shown as black circles. (C) Distribution of all known human cases in Huanan market. See table S11
for details on SARS-CoV-2-positive human cases with the Huanan market.


allowing sampling of viruses from infected 
animals (20). 

The live animal trade and live animal mar- 
kets are a common theme in virus spillover 
events (21-23, 50), with markets such as the 
Huanan market selling live mammals being 
in the highest risk category (51). The events
leading up to the COVID-19 pandemic mir- 
ror the SARS-CoV-1 outbreaks from 2002 to 
2004, which were traced to infected animals 
in the Guangdong, Jiangxi, Henan, Hunan, 
and Hubei provinces in China (20). Maximum 
effort must now be applied to elucidate the 
upstream events that might have brought 
SARS-CoV-2 into the Huanan market, culmi- 
nating in the COVID-19 pandemic. To reduce 
the risk of future pandemics, we must under- 
stand, and then limit, the routes and oppor- 
tunities for virus spillover. 
Methods summary

Ethics statement 

This research was reviewed by the Human 
Subject Protection Program at the University 
of Arizona and the Institutional Review Board 
(IRB) at The Scripps Research Institute and 
determined to be exempt from IRB approval 
because it constitutes secondary research for 
which consent is not required. 


Data sources 


COVID-19 case data from December 2019 were 
obtained from the WHO mission report (7) 
and from our previous analyses (5). Location 
information was extracted and sensitivity 
analyses performed to confirm accuracy and 
assess potential ascertainment bias. Geotagged 
January-February 2020 data from Weibo 
COVID-19 help seekers were obtained from


the authors (26). Population density data were 
obtained from WorldPop.org (27). Sequencing- 
or quantitative polymerase chain reaction 
(PCR)-based environmental sample SARS- 
CoV-2 positivity from the Huanan market was 
obtained from a January 2020 CCDC report 
(data S1) (24). 


Wildlife trading at the Huanan market 


Animal sales from Wuhan wet markets im- 
mediately before the COVID-19 pandemic were 
previously reported (8), and in this study we 
report details about animals for sale at the 
Huanan market up until November 2019. 


Spatial analyses of COVID-19 cases 


Haversine distances to the Huanan market 
were calculated for each of the geolocated 
December 2019 cases. Center points and me- 
dian distances from cases to the Huanan market 
were calculated separately for (i) all 155 cases, (ii) 
the 35 cases epidemiologically linked to the 
Huanan market, (iii) the 120 cases not epidemi- 
ologically linked to the market, (iv) the 11 lineage 
B cases, and (v) the earliest lineage A case. These 
distances were also calculated for the 737 Weibo 
help seekers from 8 January to 10 February 2020 
(26). Empirical null distributions were generated 
from the population density data and the Weibo 
data. The population density-null distributions 
were age-matched to the December 2019 cases. 
KDEs were also generated for the market- 
linked cases, unlinked cases, and all cases to 
infer a probability density function from which 
the cases could have been drawn. Highest- 
density contours representing specific prob- 
ability masses (0.5, 0.25, 0.1, 0.05, and 0.01) 
were inferred, and the location of the market 
was compared with these. 
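
The distance calculation and the comparison against an empirical null can be sketched as follows. This is a minimal illustration in Python, not the authors' pipeline: the market coordinates are approximate, the case and null locations are made up, the function names are hypothetical, and the age matching of the population-density null is not modeled.

```python
import math
import random
import statistics

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def median_distance_km(points, target):
    """Median Haversine distance from a list of (lat, lon) points to target."""
    return statistics.median(
        haversine_km(lat, lon, target[0], target[1]) for lat, lon in points
    )


def null_p_value(cases, target, null_points, null_weights, n_draws=1000, seed=0):
    """Fraction of weighted null resamples whose median distance to the
    target is at least as small as the observed median distance."""
    rng = random.Random(seed)
    observed = median_distance_km(cases, target)
    hits = 0
    for _ in range(n_draws):
        sample = rng.choices(null_points, weights=null_weights, k=len(cases))
        if median_distance_km(sample, target) <= observed:
            hits += 1
    return hits / n_draws


# Hypothetical inputs, for illustration only.
MARKET = (30.6196, 114.2576)  # approximate location, an assumption
cases = [(30.62, 114.26), (30.61, 114.25), (30.63, 114.27)]
null_points = [(30.50, 114.40), (30.55, 114.10), (30.70, 114.35), (30.45, 114.30)]
null_weights = [4.0, 2.0, 3.0, 1.0]  # stand-in for population density

print(median_distance_km(cases, MARKET))
print(null_p_value(cases, MARKET, null_points, null_weights, n_draws=200))
```

In this toy setup every null location lies farther from the target than the cases do, so no resample reaches the observed median; with real population-density data the same function returns the proportion of null draws that are as concentrated as the observed cases.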


Mobility analyses 


To estimate the relative amount of intra-urban 
human traffic to the Huanan market com- 
pared with other locations within the city of 
Wuhan, we used a location-specific dataset 
of social media check-ins in the Sina Visitor 
System as shared by Li et al. (33). This dataset 
is based on 1,491,499 individual check-in events 
across the city of Wuhan from the years 2013- 
2014 (5 to 6 years before the start of the COVID- 
19 pandemic), and 770,521 visits were associated 
with 312,190 unique user identifiers. Location 
names and categories were translated using a 
Python API for Google Translate. 


Spatial analyses of environmental samples at 
the Huanan market 


We used the official maps from the CCDC (12) 
(data S1) and the WHO map (7), as well as 
satellite photographs (Google Maps, Google 
Earth, Baidu Maps), aerial photographs, and 
images of the market in the public domain to 
reconstruct the floorplan of the market. Market 
stalls were assigned by categories of the types of 
goods sold using official reports and data from
the TianYanCha.com business directory (this 
company has since gone out of business; for 
screenshots, see table S8 and data S2). Final 
maps of the Huanan market were converted 
into geoJSON format for spatial analyses. 
Significance testing of live animal vendors 
and/or human SARS-CoV-2 cases on the num- 
ber of positive environmental samples was 
performed using a binomial general linear 
model. Distances between businesses were 
defined as the distance between their respec- 
tive center points, and spatial relative risk 
analysis was performed using the ‘sparr’ pack- 
age in R, with linear boundary kernels for edge 
correction (52) and bandwidth selection per- 
formed using least-squares cross-validation. 
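
A spatial relative risk surface compares the density of positive samples with the density of all sampled stalls. The sketch below illustrates the idea in Python under simplifying assumptions: a fixed isotropic Gaussian bandwidth and no edge correction. It does not reproduce the 'sparr' analysis (linear boundary kernels, least-squares cross-validation), and the stall coordinates and function names are hypothetical.

```python
import math


def gaussian_kde_2d(points, bandwidth):
    """Return a function that evaluates a 2D Gaussian kernel density estimate."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            d2 = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * total

    return density


def log_relative_risk(positives, sampled, bandwidth):
    """Return rho(x, y) = log(f / g), where f is the density of positive
    samples and g is the density of all sampled locations (the control)."""
    f = gaussian_kde_2d(positives, bandwidth)
    g = gaussian_kde_2d(sampled, bandwidth)

    def rho(x, y):
        return math.log(f(x, y) / g(x, y))

    return rho


# Hypothetical layout: stalls sampled on a 5 x 5 grid, positives in one corner.
sampled = [(float(i), float(j)) for i in range(5) for j in range(5)]
positives = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

rho = log_relative_risk(positives, sampled, bandwidth=1.0)
print(rho(0.5, 0.5))  # positive: elevated density of positives
print(rho(3.5, 3.5))  # negative: few positives relative to sampling effort
```

Dividing by the density of sampled stalls is what lets the surface account for uneven sampling effort: a corner that was sampled heavily only scores high if positives are overrepresented there.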
Understanding the circumstances that lead to pandemics is important for their prevention. We
analyzed the genomic diversity of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)
early in the coronavirus disease 2019 (COVID-19) pandemic. We show that SARS-CoV-2 genomic
diversity before February 2020 likely comprised only two distinct viral lineages, denoted “A” and “B.”
Phylodynamic rooting methods, coupled with epidemic simulations, reveal that these lineages were
the result of at least two separate cross-species transmission events into humans. The first zoonotic
transmission likely involved lineage B viruses around 18 November 2019 (23 October to 8 December),
and the separate introduction of lineage A likely occurred within weeks of this event. These findings
indicate that it is unlikely that SARS-CoV-2 circulated widely in humans before November 2019 and define
the narrow window between when SARS-CoV-2 first jumped into humans and when the first cases of
COVID-19 were reported. As with other coronaviruses, SARS-CoV-2 emergence likely resulted from
multiple zoonotic events.


Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for
the coronavirus disease 2019 (COVID-19) pandemic that caused more than 5 million
confirmed deaths in the 2 years after
its detection at the Huanan Seafood Wholesale 
Market (hereafter the “Huanan market”) in 
December 2019 in Wuhan, China (1-3). As the
original outbreak spread to other countries, 
the diversity of SARS-CoV-2 quickly increased 
and led to the emergence of multiple variants 
of concern, but the beginning of the pandemic 
was marked by two major lineages denoted 
“A” and “B” (4). 

Lineage B has been the most common 
throughout the pandemic and includes all 11 
sequenced genomes from humans directly 
associated with the Huanan market, includ- 
ing the earliest sampled genome, Wuhan/ 
IPBCAMS-WH-01/2019, and the reference ge- 
nome, Wuhan/Hu-1/2019 (hereafter “Hu-1”) 
(5), sampled on 24 and 26 December 2019, 
respectively. The earliest lineage A viruses, 
Wuhan/IME-WH01/2019 and Wuhan/WH04/
2020, were sampled on 30 December 2019 and 
5 January 2020, respectively (6). Lineage A 
differs from lineage B by two nucleotide sub- 
stitutions, C8782T and T28144C, which are 
also found in related coronaviruses from 
Rhinolophus bats (4), the presumed host res- 
ervoir (7). Lineage B viruses have a “C/T” pat- 
tern at these key sites (C8782 and T28144), 
whereas lineage A viruses have a “T/C” pattern 
(C8782T and T28144C). The earliest lineage A 
genomes from humans lack a direct epidemi- 
ological connection to the Huanan market but 
were sampled from individuals who lived or had
recently stayed close to the market (8). It has 
been hypothesized that lineages A and B emerged 
separately (9), but “C/C” and “T/T” genomes 
intermediate to lineages A and B present a chal- 
lenge to that hypothesis because their existence 
suggests within-human evolution of one lineage 
toward the other by way of a transitional form. 
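
The lineage definitions above reduce to a lookup on the bases at positions 8782 and 28144. A minimal sketch in Python; the function name and the "undetermined" label are illustrative assumptions, not part of the study:

```python
def classify_lineage(base_8782, base_28144):
    """Classify a genome by its bases at positions 8782 and 28144.

    C/T is lineage B, T/C is lineage A, and C/C or T/T are the
    intermediate haplotypes discussed in the text.
    """
    pattern = f"{base_8782.upper()}/{base_28144.upper()}"
    labels = {
        "C/T": "B",
        "T/C": "A",
        "C/C": "intermediate (C/C)",
        "T/T": "intermediate (T/T)",
    }
    # Any other base (including an ambiguous call such as N) is left unassigned.
    return labels.get(pattern, "undetermined")


print(classify_lineage("C", "T"))  # lineage B pattern, as in Hu-1
print(classify_lineage("T", "C"))  # lineage A pattern
print(classify_lineage("C", "C"))  # intermediate haplotype
```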

Questions about these lineages remain: If 
lineage B viruses are more distantly related to 
sarbecoviruses from Rhinolophus bats, then (i) 
why were lineage B viruses detected earlier 
than lineage A viruses, and (ii) why did lineage 
B predominate early in the pandemic? 

Answering these questions requires determin- 
ing the ancestral haplotype, the genomic se- 
quence characteristics of the most recent common 
ancestor (MRCA) at the root of the SARS-CoV-2 
phylogeny. In this study, we combined genomic 
and epidemiological data from early in the 
COVID-19 pandemic with phylodynamic models 
and epidemic simulations. We eliminated many 
of the haplotypes previously suggested as 
the MRCA of SARS-CoV-2 and show that the 
pandemic most likely began with at least two 
separate zoonotic transmissions starting in 
November 2019. 


Results 
Erroneous assignment of haplotypes 
intermediate to lineages A and B 


There are 787 near-full-length genomes 
available from lineages A and B sampled by 
14 February 2020 (data S1 and S2). However,
there are also 20 genomes of intermediate 
haplotypes from this period that contain either 


T28144C or C8782T but not both mutations: 
C/C or T/T, respectively. 

We identified numerous instances of C/C 
and T/T genomes sharing rare mutations with 
lineage A or lineage B viruses, often sequenced 
in the same laboratory, indicating that these 
intermediate genomes are likely artifacts of 
contamination or bioinformatics (10), similar
to findings from our analysis of the emergence 
of SARS-CoV-2 in North America (fig. S1 and 
supplementary text) (11). We confirmed that a
C/C genome from South Korea sharing three 
such mutations had low sequencing depth at 
position 28144 (<10x), a T/T genome sampled 
in Singapore had low coverage at both 8782 
and 28144 (<10x), and three T/T genomes 
sampled in Wuhan had low sequencing depth 
and indeterminate nucleotide assignment at 
position 8782 (table S1). Further, the authors 
of 11 C/C genomes sampled in Wuhan and 
Sichuan confirmed that low sequencing depth 
at position 8782 led to the erroneous assign- 
ment of intermediate haplotypes. 
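
One way to guard against this kind of artifact is to leave a site uncalled when coverage is low. The sketch below is a hedged illustration in Python, not the procedure used by any of the sequencing laboratories: the function name is hypothetical, and the default threshold of 10 reads simply mirrors the "<10x" depth mentioned above.

```python
from collections import Counter

MIN_DEPTH = 10  # assumption: mirrors the "<10x" coverage noted in the text
VALID_BASES = {"A", "C", "G", "T"}


def call_base(read_bases, min_depth=MIN_DEPTH):
    """Return the majority base at a site, or 'N' if coverage is too low.

    read_bases is an iterable of single-character base calls from the
    reads that cover the site.
    """
    counts = Counter(b.upper() for b in read_bases if b.upper() in VALID_BASES)
    depth = sum(counts.values())
    if depth < min_depth:
        return "N"  # too few reads to assign a base with confidence
    return counts.most_common(1)[0][0]


print(call_base("CCCCTT"))             # 6 reads: below threshold, left as N
print(call_base("T" * 30 + "C" * 2))   # 32 reads: majority base is called
```

A genome with N at position 8782 or 28144 then stays unassigned rather than being counted as a C/C or T/T haplotype.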

C/C and T/T genomes continue to be ob- 
served throughout the pandemic as a result 
of convergent evolution, including T/T in the 
Diamond Princess cruise ship outbreak and 
subsequent COVID-19 waves in New York City 
and San Diego (figs. S2 to S5 and supplementary
text). Instances of convergent evolution
are identifiable because SARS-CoV-2 phylog- 
enies exist in “near-perfect” tree space, in 
which topology can be inferred with high 
accuracy (12). These findings cast doubt on 
the claim that transitional C/C or T/T haplo- 
types between lineages A and B circulated in 
humans, reopening the door to the hypoth- 
esis that lineages A and B represent separate 
introductions. 



Progenitor genome reconstruction 

To better understand SARS-CoV-2 mutational 
patterns, we reconstructed the genome of a 
hypothetical progenitor of SARS-CoV-2. Using 
maximum likelihood ancestral state recon- 
struction across 15 nonrecombinant regions 
of SARS-CoV-2 and closely related sarbecovi- 
rus genomes sampled from bats and pangolins 
(13), we inferred the genome of this recombi- 
nant common ancestor (recCA) (figs. S6 and S7 
and supplementary text). The recCA differed 


[Fig. 1 legend and labels: C-to-T reversions; other reversions; Hu-1; C/C or T/T; WH04; 20SF012; WA1; MKAK-CL-2020-6430]


Fig. 1. Maximum likelihood phylogeny of the early SARS-CoV-2 pandemic, showing nucleotide 
reversions and putative candidates for the ancestral haplotype at the MRCA. Putative ancestral 
haplotypes are identified with colored shapes. Reversions from the Hu-1 reference genome to the recCA 
are colored. Blue indicates C-to-T reversions, and black indicates all other reversions. The tree is rooted 


on Hu-1 to show reversion dynamics to the recCA. 


from Hu-1 by just 381 substitutions, including 
C8782T and T28144C. It is more informative 
than an outgroup sarbecovirus because it ac- 
counts for the closest relative across all re- 
combinant segments (figs. S8 to S14 and 
supplementary text) (14) and, as an internal node
on the phylogeny, is more genetically similar to 
SARS-CoV-2 than any extant sarbecovirus. 


Reversions across the early pandemic phylogeny 


The ubiquity of SARS-CoV-2 reversions (muta- 
tions from Hu-1 toward the recCA) indicates 
that genetic similarity to related viruses is a 
poor proxy for the ancestral haplotype. We 
observe 23 distinct reversions and 631 distinct 
substitutions (excluding reversions) across the 
SARS-CoV-2 phylogeny from the COVID-19 
pandemic up to 14 February 2020 (Fig. 1). 
Substitutions were overrepresented at the 
381 sites separating the recCA from Hu-1 (23 of 
381, 6.04%), compared with substitutions at all 
other sites (631 of 29,134, 2.17%). 
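
The overrepresentation quoted above follows directly from the counts. A short arithmetic check in Python; the variable and function names are illustrative:

```python
def percent(count, total):
    """Return count / total expressed as a percentage."""
    return 100.0 * count / total


# Counts taken from the text.
reversion_rate = percent(23, 381)     # substitutions at the 381 recCA/Hu-1 sites
other_rate = percent(631, 29134)      # substitutions at all other sites
fold_enrichment = reversion_rate / other_rate

print(f"{reversion_rate:.2f}% vs {other_rate:.2f}%")
print(f"fold enrichment: {fold_enrichment:.1f}")
```

The ratio of the two rates is a little under threefold, which is the sense in which substitutions are overrepresented at the sites that separate the recCA from Hu-1.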

Most reversions were C-to-T mutations (19 
of 23, 82.6%), matching the mutational bias of 
SARS-CoV-2 (15-17). Genomes with C-to-T re- 
versions can be found within lineage A, in- 
cluding C18060T (lineage A.1; for example, 
WA1) and C29095T (for example, 20SF012),
as well as C24023T, C25000T, C4276T, and 
C22747T in mid-late January and February 
2020. Hence, triple revertant genomes, such 
as WA1 and 20SFO012, are neither unique nor 
rare. We also identified a lineage A genome 
(Malaysia/MKAK-CL-2020-6430/2020), sampled 
on 4 February 2020 from a Malaysian citizen 
traveling from Wuhan whose only four muta- 
tions from Hu-1 are all reversions (lineage A.1 
+T6025C) (Fig. 1). Therefore, no highly rever- 
tant haplotype can automatically be assumed 
to represent the MRCA of SARS-CoV-2, espe- 
cially when these reversions are most often the 
result of C-to-T mutations. We continue to 
observe these reversion patterns throughout 
the pandemic, including in the emergence of 


Table 1. Posterior probabilities of inferred ancestral haplotype at the MRCA of SARS-CoV-2. Positions 8782 and 28144 are indicated in parentheses.
Representative genome is a genome with sequence matching the haplotype. “No market” excludes 15 market-associated genomes (13 lineage B genomes
associated with the Huanan market plus one lineage A and one lineage B genome not associated with the Huanan market). *BF > 10; **BF > 100; ***BF > 1000.
BFs are in favor of hypothesis rejection.

Haplotype

B (C/T)

A (T/C)

C8782T+T28144C+C18060T

Fig. 2. Probability of phylogenetic structures arising from a single introduction of SARS-CoV-2 in
epidemic simulations. (A) A large polytomy of at least 100 descendent lineages, which is consistent
with the base of both lineages A and B. (B) Topology matching a C/C ancestral haplotype: two clades, each
one mutation from the ancestor, both with polytomies of at least 100 descendent lineages. (C) Topology
matching either a lineage A or lineage B ancestral haplotype: a basal polytomy with at least 100 descendent
lineages, including a large clade separated by two mutations, also possessing a polytomy of at least
100 descendent lineages. Basal taxa have short branch lengths for clarity. The probability of each
phylogenetic structure after a single introduction is reported in the respective boxes.


World Health Organization (WHO)-named 
variants (figs. S15 and S16). 


Inferring the MRCA of SARS-CoV-2 


To infer the ancestral SARS-CoV-2 haplotype, 
we developed a nonreversible, random-effects 
substitution process model in a Bayesian phylo- 
dynamic framework that simultaneously recon- 
structs the underlying coalescent processes and 
the sequence of the MRCA of the SARS-CoV-2 
phylogeny. The random-effects substitution 
model captures the C-to-T transition and G-to- 
T transversion biases (fig. S17 and supplemen- 
tary text). Using this model, referred to as the 
unconstrained rooting (fig. S18A), we inferred
the ancestral haplotype of the 787 lineage A 
and B genomes sampled by 14 February 2020. 

Our unconstrained rooting strongly favors 
a lineage B or C/C ancestral haplotype and 
shows that a lineage A ancestral haplotype is 
inconsistent with the molecular clock [Bayes 
factor (BF) = 48.1] (Table 1). Lineage B exhibits 
more divergence from the root of the tree than 
would be expected if lineage A were the an- 
cestral virus in humans (figs. S19 and S20). 
The T/T ancestral haplotype was also disfa- 
vored (BF > 10), likely because of the C-to-T 
transition bias (fig. S17). We acknowledge that 
the timing of the earliest sampled lineage B 
genomes associated with the Huanan market 
could bias rooting inference toward lineage B 
haplotypes; however, lineage A was still disfavored 
after excluding all market-associated genomes 
(BF = 11.0). 
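
For readers less familiar with Bayes factors, the short Python sketch below shows one standard way to express a BF in favor of rejecting a hypothesis as the ratio of posterior odds to prior odds, and how such a value maps onto the asterisk thresholds used in Table 1. The function names and the example probabilities are illustrative assumptions and are not values from our analyses.

```python
def bayes_factor_against(posterior: float, prior: float) -> float:
    """BF in favor of rejecting a hypothesis.

    Computed as (posterior odds against) / (prior odds against).
    `posterior` and `prior` are the probabilities OF the hypothesis.
    """
    if not (0.0 < posterior < 1.0 and 0.0 < prior < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    posterior_odds_against = (1.0 - posterior) / posterior
    prior_odds_against = (1.0 - prior) / prior
    return posterior_odds_against / prior_odds_against


def support_label(bf: float) -> str:
    """Asterisk label following the Table 1 thresholds (BF > 10, 100, 1000)."""
    if bf > 1000:
        return "***"
    if bf > 100:
        return "**"
    if bf > 10:
        return "*"
    return ""


# Hypothetical example: a haplotype with prior probability 0.2
# that receives posterior probability 0.02.
example_bf = bayes_factor_against(posterior=0.02, prior=0.2)
print(round(example_bf, 2), support_label(example_bf))
```

Under this convention, a BF of 48.1 would carry a single asterisk because it exceeds 10 but not 100.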

Even though sequence similarity to closely 
related sarbecoviruses alone is insufficient to 
determine the SARS-CoV-2 ancestral haplotype,
this similarity can inform phylodynamic
inference. Rather than rely on outgroup rooting 
(fig. S18B) (18), we developed a rooting method 
that assigns the recCA as the progenitor of 
the inferred SARS-CoV-2 MRCA (fig. S18C). 
As opposed to the unconstrained rooting, 
the recCA root favored a lineage A haplotype 
over lineage B, although support for C/C was 
unchanged (Table 1). Our results were insen- 
sitive to the method of breakpoint identifi- 
cation in the recCA (supplementary text). 

The A.1 and A+C29095T proposed ancestral 
haplotypes were strongly rejected by all the 
phylodynamic analyses, even when rooting 
with recCA or bat sarbecovirus outgroups, 
which include both C18060T and C29095T 
(Table 1 and data S3). Hence, WA1-like and 
20SF012-like haplotypes cannot plausibly represent
the MRCA of SARS-CoV-2 as previously
suggested (19-21); the similarity of these ge- 
nomes to the recCA is due to C-to-T reversions. 
Haplotypes not reported in Table 1 were simi- 
larly rejected (data S3). 

We inferred the time of MRCA (tMRCA) 
for SARS-CoV-2 to be 11 December 2019 
[95% highest posterior density (HPD) inter- 
val, 25 November to 12 December] by using 
unconstrained rooting. It has been suggested 
that a phylogenetic root in lineage A would 
produce an older tMRCA than would a lineage 
B rooting (27). Therefore, we developed an 
approach to assign a haplotype as the SARS- 
CoV-2 MRCA (A, B, C/C, A.1, or A+C29095T) and 
inferred the tMRCA (fig. S18D). The tMRCA was 
consistent with the recCA-rooted and fixed 
ancestral haplotype analyses (table S2 and sup- 
plementary text). 


We infer only three plausible ancestral 
haplotypes: lineage A, lineage B, and C/C. 
However, the inability to reconcile the molec- 
ular clock at the outset of the COVID-19 
pandemic with a lineage A ancestor without 
information from related sarbecoviruses (such 
as the recCA) requires us to question the as- 
sumption that both lineages A and B resulted 
from a single introduction. 


Separate introductions of lineages A and B 


We next sought to determine whether a single 
introduction from one of the plausible ances- 
tral haplotypes (lineage A, lineage B, or C/C) is 
consistent with the SARS-CoV-2 phylogeny. 
We simulated SARS-CoV-2-like epidemics 
(22, 23) with a doubling time of 3.47 days 
[95% highest density interval (HDI) across 
simulations, 1.35 to 5.44] (24-26) to account 
for the rapid spread of SARS-CoV-2 before it 
was identified as the etiological agent of 
COVID-19 (figs. S21 and S22, tables S3 and 
S4, and supplementary text). We then simu- 
lated coalescent processes and viral genome 
evolution across these epidemics to determine 
how frequently we recapitulated the observed 
SARS-CoV-2 phylogeny. 
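
To make the doubling time concrete, the following Python sketch converts a doubling time into an exponential growth rate and into an idealized infection count. It assumes purely deterministic exponential growth, which is a simplification: the epidemics described above are stochastic simulations, and the function names here are hypothetical.

```python
import math


def growth_rate_from_doubling_time(doubling_time_days: float) -> float:
    """Exponential growth rate r (per day) such that 2 = exp(r * T_d)."""
    if doubling_time_days <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_days


def idealized_infections(initial: float, doubling_time_days: float, days: float) -> float:
    """Infections after `days` under deterministic doubling (illustration only)."""
    return initial * 2 ** (days / doubling_time_days)


# Doubling time of 3.47 days, with the 95% HDI bounds of 1.35 and 5.44 days.
for t_d in (1.35, 3.47, 5.44):
    print(t_d, round(growth_rate_from_doubling_time(t_d), 3))
```

A shorter doubling time corresponds to a larger growth rate, so the lower HDI bound of 1.35 days is the fastest-growing scenario in this range.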

Lineages A and B comprise 35.2 and 64.8% 
of the early SARS-CoV-2 genomes, respectively, 
and each lineage is characterized by a large 
polytomy (many sampled lineages descend- 
ing from a single node on the phylogenetic 
tree), with the base of lineages A and B being 
the two largest polytomies observed in the 
early pandemic (Fig. 1). Furthermore, large 
polytomies are characteristic of SARS-CoV-2 
introductions into geographical regions at 
the start of the pandemic (for example, fig. 
S23) (11, 27-29) and would similarly be expected
to occur after a successful introduction
of SARS-CoV-2 into humans. Congruently, the 
most common topology in our simulations is a 
large basal polytomy (with ≥100 descendent
lineages), which is present in 47.5% of simu- 
lated epidemics (Fig. 2A). 
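
As a minimal illustration of the polytomy criterion, the Python sketch below counts the lineages that descend directly from a node and checks them against the threshold of 100. It assumes that the tree is already stored with explicit multifurcations as a dictionary of child lists; the representation, the function names, and the toy tree are hypothetical and do not reproduce the procedure applied to our simulated trees.

```python
from typing import Dict, List

# Node name -> names of its direct children (tips have no entry).
Tree = Dict[str, List[str]]


def polytomy_size(tree: Tree, node: str) -> int:
    """Number of lineages descending directly from `node`."""
    return len(tree.get(node, []))


def has_large_polytomy(tree: Tree, node: str, min_size: int = 100) -> bool:
    """True if `node` has at least `min_size` direct descendants."""
    return polytomy_size(tree, node) >= min_size


# Toy tree: 120 tips plus one small clade attached to the root.
toy_tree: Tree = {
    "root": [f"tip{i}" for i in range(120)] + ["cladeX"],
    "cladeX": ["x1", "x2"],
}
print(has_large_polytomy(toy_tree, "root"))
print(has_large_polytomy(toy_tree, "cladeX"))
```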

By contrast, a topology corresponding to 
a single introduction of an ancestral C/C 
haplotype—characterized by two clades, each 
comprising ≥30% of the taxa, possessing a
large polytomy at the base, and separated 
from the MRCA by one mutation (Fig. 2B)— 
was only observed in 0.0% of our simulations. 
Further, a topology corresponding to a single 
introduction of an ancestral lineage A or line- 
age B haplotype—characterized by a large 
basal polytomy and a large clade, comprising 
between 30 and 70% of taxa, two mutations 
from the root with no intermediate genomes— 
was observed in only 0.5% of our simulations 
(Fig. 2C and supplementary text). 

Our epidemic simulations do not support 
a single introduction of SARS-CoV-2 giving 
rise to the observed phylogeny. We there- 
fore quantified the relative support for two 
introductions resulting in the empirical
topology. By synthesizing posterior probabilities
of inferred ancestral haplotypes,
frequencies of topologies in epidemic simulations,
and the expected relationships between
these haplotypes and topologies, we
inferred strong support favoring separate
introductions of lineages A and B (BF = 61.6
and BF = 60.0 by using the recCA and
unconstrained rooting, respectively) [supplementary
materials (SM), materials and methods].
This support is robust across shorter and
longer doubling times, varying ascertainment
rates, and minimum polytomy size (tables
S4 and S5).

If lineages A and B arose from separate
introductions, then the MRCA of SARS-CoV-2
was not in humans, and it is the tMRCAs of
lineages A and B that are germane to the
origins of SARS-CoV-2 (not the timing of their
shared ancestor). Rooting with the recCA, we
inferred the median tMRCA of lineage B to
be 15 December (95% HPD, 5 December to
23 December) and the median tMRCA of lineage
A to be 20 December (95% HPD, 5 December
to 29 December) (Fig. 3A). The tMRCA of
lineage B consistently predates the tMRCA
of lineage A (Fig. 3B). These results are robust
to using unconstrained rooting, fixing the
ancestral haplotype, and excluding market-associated
genomes (Fig. 3, A and B; table S2;
and supplementary text).

Fig. 3. Comparison of the tMRCA and primary case dates for lineage A and
lineage B in late 2019 across rooting strategies. Each row represents a
different rooting constraint in phylodynamic analysis, with lineage B, C/C, and
lineage A representing a fixed ancestral haplotype. (A) The tMRCA for lineages A
and B. (B) The number of weeks the tMRCA of lineage A occurs after the tMRCA
of lineage B. (C) The timing of the primary case for lineages A and B. (D) The
number of weeks the time of the primary case of lineage A occurs after the
time of the primary case of lineage B. Long dashed lines indicate the median,
and shading indicates the 95% HPD for each distribution. Short dashed lines
indicate 0 weeks difference between lineages A and B. Posterior probability that
lineage A originated after lineage B is reported in the gray box in each graph
in (B) and (D).


Timing the introductions of lineages A and B


The primary case, the first human infected
with a virus in an outbreak, could precede the
tMRCA if basal lineages went extinct during
cryptic transmission (23, 30, 31). The index
case, the first identified case, is rarely also
the primary case (32, 33). We next used an
extension of our previously published framework
that combines epidemic simulations and
phylodynamic tMRCA inference (SM materials and
methods) (23, 30, 32) to infer the timing of the
lineage B and lineage A primary cases,
accounting for both the index case symptom
onset date and earliest documented COVID-19
hospitalization date.

The earliest unambiguous case of COVID-19,
with symptom onset on 10 December and
hospitalization on 16 December, was a seafood
vendor at the Huanan market. Unfortunately, 
no published genome is available for this case 
(8). Nonetheless, we can reasonably assume 
that this individual had a lineage B virus (sup- 
plementary text) because an environmental 
sample (EPI_ISL_408512) from the stall this 
vendor operated was lineage B. The earliest 
lineage A genome (IME-WH01) is from a
familial cluster for which the earliest symp- 
tom onset is 15 December and earliest hos- 
pitalization is 25 December (34). Accounting 
for these dates and using the recCA rooting, 
we inferred the infection date of the lineage B 
primary case to be 18 November (95% HPD, 
23 October to 8 December) and the infec- 
tion date of the primary case of lineage A 
to be 25 November (95% HPD, 29 October to 
14 December). The lineage B primary case 
predated that of lineage A in 64.6% of the 
posterior sample, by a median of 7 days (Fig. 3D 
and table S6). 

Our lineage A and B primary case infer- 
ence is robust to rooting strategy and fixing 
the plausible ancestral haplotype to lineage A, 
lineage B, or C/C, as well as different index
case dates, accounting for only hospitalization
dates and varying growth rates and ascertainment
rates (tables S7 to S10 and supplementary
text). Therefore, our results indicate
that lineage B was introduced into humans
no earlier than late October and likely in
mid-November 2019, and the introduction of
lineage A occurred within days to weeks of
this event.

We then inferred the number of ascertained
infections and hospitalizations arising from
these separate introductions. We found that
an earlier introduction of lineage B led to a
faster rise in lineage B-associated infections,
dominating the simulated epidemics
(Fig. 4) and recapitulating the predominance
of lineage B observed in China in early 2020
(35). Similarly, simulated lineage B hospitalizations
are more common than those from
lineage A through January 2020 (fig. S24).
We observed these patterns regardless of
rooting strategy (unconstrained or recCA),
ancestral haplotype (B, A, or C/C) (Fig. 4
and tables S11 and S12), and doubling time
(figs. S25 to S28).

Fig. 4. Dynamics of simulated SARS-CoV-2 epidemics resulting from
separate introductions of lineages A and B in late 2019. Each row
represents a different rooting constraint in phylodynamic analysis, with lineage
B, C/C, and lineage A representing a fixed ancestral haplotype. (A) Estimated
number of infections. The header of each column indicates whether the number
of infections is caused by lineage A, lineage B, or the two lineages combined.
Darker and lighter shading indicates the 50 and 95% HPD, respectively. (B) The
log ratio of lineage B to lineage A infections on 15 December 2019. Posterior
probability of having more lineage B infections than lineage A reported in the
gray box in each graph.


964 26 AUGUST 2022 * VOL 377 ISSUE 6609


Minimal cryptic circulation of SARS-CoV-2

We do not see evidence for substantial cryptic
circulation before December 2019 (Fig. 4), even
if we assume a single introduction (fig. S29
and supplementary text). Our simulated epidemics
have a median of three (95% HPD,
1 to 18) cumulative infections at the tMRCA,
with 99% of simulated epidemics resulting in
at most 33 infections (table S13 and supplementary
text). Further, it is unlikely that there
were any COVID-19-related hospitalizations
before December (36) because the simulated
epidemics show a median of zero (95% HPD,
0 to 2) hospitalizations by 1 December 2019.
These results are in accordance with the lack
of a single SARS-CoV-2-positive sample among
tens of thousands of serology samples from
healthy blood donors from September to
December 2019 (37) and thousands of specimens
obtained from influenza-like illness
patients at Wuhan hospitals from October to
December 2019 (34). Therefore, there was
likely extremely low prevalence of SARS-CoV-2
in Wuhan before December 2019. Even when
we simulated epidemics with a longer doubling
time, resulting in an earlier timing of the
primary cases (tables S8 and S10), there were
still few infections before December 2019
(table S13).


Additional introductions 


The extinction rate of our simulated epidemics 
(simulations that did not produce self-sustaining 
transmission chains) indicates that there were
likely multiple failed introductions of SARS- 
CoV-2. Similar to our previous findings (23), 
77.8% of simulated epidemics went extinct. 
These failed introductions produced a mean 
of 2.06 infections and 0.10 hospitalizations; 
hence, failed introductions could easily go un- 
noticed. If we treat each SARS-CoV-2 intro- 
duction, failed or successful, as a Bernoulli 
trial and simulate introductions until we see 
two successful introductions, we estimate that 
eight (95% HPD, 2 to 23) introductions led to 
the establishment of both lineage A and B in 
humans. 
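
The Bernoulli-trial reasoning above can be sketched analytically in Python: if each introduction succeeds independently with probability p, the number of introductions needed to reach two successes follows a negative binomial distribution. The sketch below assumes p equals one minus the 77.8% extinction rate and reports only the mean and median; it is an illustrative stand-in with hypothetical function names, not the simulation procedure we used.

```python
from typing import Tuple


def introductions_until_two_successes(p_success: float, max_n: int = 100_000) -> Tuple[float, int]:
    """Mean and median number of introductions needed for two successes.

    P(N = n) = (n - 1) * p^2 * (1 - p)^(n - 2) for n >= 2.
    """
    if not (0.0 < p_success <= 1.0):
        raise ValueError("p_success must be in (0, 1]")
    q = 1.0 - p_success
    mean = 2.0 / p_success
    cumulative = 0.0
    for n in range(2, max_n + 1):
        cumulative += (n - 1) * p_success ** 2 * q ** (n - 2)
        if cumulative >= 0.5:
            return mean, n  # first n at which the CDF reaches one half
    raise ValueError("median not reached; increase max_n")


# Assumption: success probability = 1 - extinction rate reported above.
mean_n, median_n = introductions_until_two_successes(1.0 - 0.778)
print(round(mean_n, 2), median_n)
```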


Limitations 


Our analysis of the putative intermediate 
haplotypes suggests that there remain lineage 
assignment errors between lineages A and B, 
particularly of genomes sampled in January
and February of 2020, which could influence 
the precision of the phylogenetic topology 
and tMRCA inference. We lack direct evidence 
of a virus closely related to SARS-CoV-2 in 
nonhuman mammals at the Huanan market 
or its supply chain. The genome sequence of a 
virus directly ancestral to SARS-CoV-2 would 
provide more precision regarding the timing 
of the introductions of SARS-CoV-2 into humans 
and the epidemiological dynamics before its dis- 
covery. Although we simulated epidemics across 
a range of plausible epidemiological dynamics, 
our models represent a time frame before the as- 
certainment of COVID-19 cases and sequencing 
of SARS-CoV-2 genomes and thus before when 
these models could be empirically validated. 


Discussion 


The genomic diversity of SARS-CoV-2 during 
the early pandemic presents a paradox. Line- 
age A viruses are at least two mutations closer 
to bat coronaviruses, indicating that the an- 
cestor of SARS-CoV-2 arose from this lineage. 
However, lineage B viruses predominated 
early in the pandemic, particularly at the 
Huanan market, indicating that this lineage 
began spreading earlier in humans. Further 
complicating this matter is the molecular clock 
of SARS-CoV-2 in humans, which rejects a 
single-introduction origin of the pandemic 
from a lineage A virus. We resolved this para- 
dox by showing that early SARS-CoV-2 genomic 
diversity and epidemiology are best explained 
by at least two separate zoonotic transmissions, 
in which lineage A and B progenitor viruses 
were both circulating in nonhuman mam- 
mals before their introduction into humans 
(figs. S30 and S31). 

The most probable explanation for the intro- 
duction of SARS-CoV-2 into humans involves 
zoonotic jumps from as-yet-undetermined, 
intermediate host animals at the Huanan 
market (34, 38, 39). Through late 2019, the 
Huanan market sold animals that are known 
to be susceptible to SARS-CoV-2 infection 
and capable of intraspecies transmission (40-42). 
The presence of potential animal reservoirs, 
coupled with the timing of the lineage B primary 
case and the geographic clustering of early cases 
around the Huanan market (39), support the 
hypothesis that SARS-CoV-2 lineage B jumped 
into humans at the Huanan market in mid- 
November 2019. 

In a related study (39), we show that the 
two earliest lineage A cases are more closely 
positioned geographically to the Huanan 
market than expected compared with other 
COVID-19 cases in Wuhan in early 2020, de- 
spite having no known association with the 
market. This geographic proximity is consistent 
with a separate and subsequent origin of lineage 
A at the Huanan market in late November 2019.
The presence of lineage A virus at the Huanan 
market was confirmed by Gao et al. (43) from a
sample taken from discarded gloves. 

The high extinction rate of SARS-CoV-2 trans- 
mission chains, observed in both our simu- 
lations and real-world data (44), indicates that 
the two zoonotic events that established line- 
ages A and B may have been accompanied by 
additional, cryptic introductions. However, 
such introductions could easily be missed, 
particularly if their subsequent transmission 
chains quickly went extinct or the introduced 
viruses had a lineage A or B haplotype. Failed 
introductions of intermediate haplotypes are 
also possible. Critically, we have no evidence of 
subsequent zoonotic introductions in late 
December leading up to the closure of the 
Huanan market on 1 January 2020. By then, 
the susceptible host animals that had been 
documented at the market during the previous 
months were no longer found in the Huanan 
market (34). 

Other coronavirus epidemics and outbreaks 
in humans—including SARS-CoV-1, Middle East 
respiratory syndrome coronavirus (MERS-CoV), 
and most recently, porcine deltacoronavirus 
in Haiti—have been the result of repeated intro- 
ductions from animal hosts (45-47). These re- 
peated introductions were easily identifiable 
because human viruses in these outbreaks were 
more closely related to viruses sampled in the 
animal reservoirs than to other human viruses. 
However, the genomic diversity within the 
putative SARS-CoV-2 animal reservoir at the 
Huanan market was likely shallower than that 
seen in SARS-CoV-1 and MERS-CoV reservoirs 
(45, 46, 48). Hence, even though lineages A 
and B had nearly identical haplotypes, their 
MRCA likely existed in an animal reservoir. 
The ability to disentangle repeated introduc- 
tions of SARS-CoV-2 from a shallow genetic 
reservoir has previously been shown in the 
early SARS-CoV-2 epidemic in Washington 
state, where two viruses, separated by two 
mutations, were independently introduced 
from, and shared an MRCA in, China (figs. 
S23 and S30 and supplementary text) (7).

Successful transmission of both lineage A 
and B viruses after independent zoonotic events 
indicates that evolutionary adaptation within 
humans was not needed for SARS-CoV-2 to 
spread (49). We now know that SARS-CoV-2 
can readily spread after reverse-zoonosis to 
Syrian hamsters (Mesocricetus auratus), Ameri- 
can mink (Neovison vison), and white-tailed 
deer (Odocoileus virginianus), indicating its 
host generalist capacity (50-55). Furthermore, 
once an animal virus acquires the capacity for 
human infection and transmission, the only 
remaining barrier to spillover is contact be- 
tween humans and the pathogen. Thereafter, a 
single zoonotic transmission event indicates 
that the conditions necessary for spillovers have 
been met, which portends additional jumps. 
For example, there were at least two zoonotic 


jumps of SARS-CoV-2 into humans from pet 
hamsters in Hong Kong (55) and dozens from 
minks to humans on Dutch fur farms (52, 53). 

We show that it is highly unlikely that 
SARS-CoV-2 circulated widely in humans ear- 
lier than November 2019 and that there was 
limited cryptic spread, with at most dozens of 
SARS-CoV-2 infections in the weeks leading 
up to the inferred tMRCA, but likely far fewer. 
By late December, when SARS-CoV-2 was iden- 
tified as the etiological agent of COVID-19 
(8), the virus had likely been introduced into 
humans multiple times as a result of persistent 
contact with a viral reservoir. 


Materials and methods summary 


Materials and methods described in full detail 
can be found in the supplementary materials. 


Sequence data 


We queried the GISAID database (56), Gen- 
Bank, and National Genomics Data Center of 
the China National Center for Bioinformatics 
(CNCB) for complete high-coverage SARS-CoV-2 
genomes collected by 14 February 2020, result- 
ing in a dataset of 787 taxa belonging to lineages 
A and B and 20 taxa with C/C or T/T haplotypes.
Genomes were aligned by using MAFFT v7.453 
(57) to the SARS-CoV-2 reference genome 
(Wuhan/Hu-1/2019), and 388 sites were masked 
at the 5’ and 3’ ends and at sites based on 
De Maio et al. (58). All genome accessions are 
available in data S1 and S2. 


Progenitor genome reconstruction and 
reversion analysis 


We reconstructed the progenitor of SARS-CoV-2, 
the recCA. We (i) inferred a maximum
likelihood tree of 31 sarbecovirus genomes 
(SARS-CoV-2 and 30 closely related sarbeco- 
viruses sampled from bats and pangolins) 
across 15 predefined nonrecombinant regions 
(13) with IQ-TREE v2.0.7 (59), (ii) inferred the 
sequence of the ancestor of SARS-CoV-2 in 
each tree with TreeTime v0.8.1 (60), and (iii) 
concatenated the resulting sequences. We next 
inferred a maximum likelihood tree of the 
787 SARS-CoV-2 taxa with IQ-TREE and per- 
formed ancestral state reconstruction with 
TreeTime to identify substitutions that were 
reversions from Wuhan-Hu-1 to the recCA 
across the SARS-CoV-2 phylogeny. 
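
The reversion criterion can be sketched in a few lines of Python: a substitution counts as a reversion when it changes the Wuhan-Hu-1 base into the recCA base at a site where the two differ. The data structures and function name are hypothetical, the sketch does not call TreeTime, and the only site states taken from the text are that the recCA carries T at positions 18060 and 29095; position 100 is an invented control site.

```python
from typing import Dict, NamedTuple


class Substitution(NamedTuple):
    position: int     # genome coordinate of the substitution
    parent_base: str  # state at the parent node
    child_base: str   # state at the child node


def is_reversion_to_recca(sub: Substitution,
                          reference: Dict[int, str],
                          recca: Dict[int, str]) -> bool:
    """True if `sub` turns the reference base into the recCA base
    at a site where the reference and the recCA differ."""
    ref_base = reference[sub.position]
    anc_base = recca[sub.position]
    return (ref_base != anc_base
            and sub.parent_base == ref_base
            and sub.child_base == anc_base)


# Toy site states: Wuhan-Hu-1 has C and the recCA has T at 18060 and 29095.
reference = {18060: "C", 29095: "C", 100: "A"}
recca = {18060: "T", 29095: "T", 100: "A"}

print(is_reversion_to_recca(Substitution(18060, "C", "T"), reference, recca))
print(is_reversion_to_recca(Substitution(100, "A", "G"), reference, recca))
```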


Phylodynamic inference and epidemic simulations 


We performed phylodynamic inference using 
BEAST v1.10.5 (61) with the 787-taxon dataset
to infer the ancestral haplotype and the tMRCA 
of SARS-CoV-2 (and the tMRCAs of lineages 
A and B), using a nonreversible random- 
effects substitution model and exploring un- 
constrained rooting, recCA-rooting, fixing the 
ancestral haplotype as a root, and outgroup 
rooting. SARS-CoV-2-like epidemics were 
simulated with FAVITES-COVID-Lite v0.0.1 
(22, 62) using a scale-free network of 5 million
individuals and a customized extension of 
the SAPHIRE model (63), producing coales- 
cent trees on which we simulated mutations. 
We calculated the BF comparing the support 
of two introductions of SARS-CoV-2 with 
one introduction by considering the posterior 
probabilities of the four most likely ancestral 
haplotypes from the phylodynamic inference 
(lineage A, lineage B, C/C, and T/T), the fre- 
quencies of the phylogenetic structures asso- 
ciated with introductions of these haplotypes 
in the epidemic simulations, and equal prior 
probabilities for each ancestral haplotype and 
the number of introductions. 

We connected the phylodynamic inference 
and epidemic simulations by means of a 
rejection sampling-based approach (23), ac- 
counting for the tMRCAs of lineages A and B 
and the earliest documented COVID-19 illness 
onset and hospitalization dates. We then in- 
ferred the timing of the introductions of 
lineages A and B and the infections and hos- 
pitalizations for each lineage. The proportion 
of epidemic simulations that went extinct (no 
onward transmission by the end of the simu- 
lation) was used to approximate the number 
of SARS-CoV-2 introductions needed to result 
in two introductions with sustained onward 
transmission. 
A sustainable mouse karyotype created by
programmed chromosome fusion


Li-Bin Wang, Zhi-Kun Li, Le-Yun Wang, Kai Xu, Tian-Tian Ji, Yi-Huan Mao,
Si-Nan Ma, Tao Liu, Cheng-Fang Tu, Qian Zhao, Xu-Ning Fan, Chao Liu, Li-Ying Wang,
You-Jia Shu, Ning Yang, Qi Zhou, Wei Li


Chromosome engineering has been attempted successfully in yeast but remains challenging in higher 
eukaryotes, including mammals. Here, we report programmed chromosome ligation in mice that resulted 
in the creation of new karyotypes in the lab. Using haploid embryonic stem cells and gene editing, 

we fused the two largest mouse chromosomes, chromosomes 1 and 2, and two medium-size 
chromosomes, chromosomes 4 and 5. Chromatin conformation and stem cell differentiation were 
minimally affected. However, karyotypes carrying fused chromosomes 1 and 2 resulted in arrested 
mitosis, polyploidization, and embryonic lethality, whereas a smaller fused chromosome composed of 
chromosomes 4 and 5 was able to be passed on to homozygous offspring. Our results suggest the 
feasibility of chromosome-level engineering in mammals. 


The laboratory house mouse (Mus musculus)
has maintained a standard 40-chromosome
karyotype after more than 100 years of
artificial breeding (7). Over longer time
scales, however, karyotype changes caused
by chromosome rearrangements are common: 
Rodents have 3.2 to 3.5 chromosome rearrange- 
ments per million years, whereas primates have 
1.6 chromosome rearrangements per million 
years (2). In humans (2n = 46, where n is a single 
set of chromosomes), the metacentric chromo- 
some 2 was formed by the Robertsonian (Rb) 
fusion of two acrocentric chromosomes that 
remain separate in Gorilla gorilla (2n = 48) (3). 
A reciprocal translocation between ancestor 
human chromosomes 5 and 17 produced chro- 
mosomes 4 and 19 in the gorilla (4). Rb fusion 
or reciprocal translocation can also cause an- 
euploidy, uniparental disomy, or childhood 
leukemia (5-7). 

Using embryonic stem cells (ESCs) and the 
Cre-loxP system, researchers have attempted to 
derive mouse models with programmed chro- 
mosome rearrangements, but only subchro- 
mosomal rearrangements have been achieved 
(8). Recent advances in genome editing have 
greatly facilitated chromosome engineering in 
haploid yeast (9-11). In mammals, yeast-like 
haploid ESCs (haESCs) were derived first from 
unfertilized mouse embryos and then from 
rat, monkey, and human counterparts (12-16). 
However, genomic imprinting is frequently
lost in haESCs, limiting their pluripotency and 
potential for genetic engineering (17-19). 

We recently discovered that by deleting three 
imprinted regions, we could establish a stable 
sperm-like imprinting pattern in haESCs (20). 
Because they have yeast-like haploidy and 
passage-persistent pluripotency, we used these 
cells in this study to test the feasibility of 
chromosome engineering in mammals. To 
ligate the entire arms of two nonhomologous 
mouse chromosomes into one, we designed a 
strategy that combined Rb fusion and reciprocal 
translocation. We wished to address whether 
we could ligate chromosomes in mammalian 
cells. We also examined how it would affect 
stem cell differentiation and chromatin orga- 
nization and to what extent it would affect 
mouse phenotypes. 


Results 
Chromosome ligation in mouse haESCs 


We chose to ligate two medium-size mouse 
chromosomes (chromosomes 4 and 5) head 
to tail (Chr4+5; Fig. 1A) and the two largest 
mouse chromosomes (chromosomes 1 and 2) 
in opposite orientations (Chr1+2 and Chr2+1; 
Fig. 1A). Telomere and centromere neighboring 
single-guide RNAs (sgRNAs) with cleavage effi- 
ciencies greater than 0.17 were used to generate 
double-strand breaks (DSBs) in these chromo- 
somes (tables S1 and S2). New haESC lines were 
established and used before passage 15. After 
cotransfecting sgRNA- and Cas9-expressing 
plasmids into haESCs, we used polymerase 
chain reaction (PCR) to genotype the cells for 
the desired editing results (table S3). Positive 
outcomes were identified in 0.69 to 1.4% of 
transfected cells (fig. S1, A to C). Sanger se- 
quencing analysis revealed bivalent endpoint 
sequences of targeted chromosomes in which 
nucleotide deletions and insertions were ob- 
served (fig. S1, D to F), indicating interchro- 


mosomal DNA repair by nonhomologous end 
joining after CRISPR-Cas9-mediated cleavages. 

Fluorescence in situ hybridization (FISH) 
was also used to confirm Chr4+5 and Chr2+1 
ligation in haESCs (Chr4+5 haESCs and Chr2+1 
haESCs; Fig. 1B). However, Chr1+2 in haESCs 
(Chr1+2 haESCs) had been split into two. The 
first part was a segment of chromosome 1 fused 
with chromosome 2 (Fig. 1B), and the second 
part was the remaining chromosome 1 fused 
with an arm of chromosome 17 (fig. S1G), as 
indicated by standard G-banding karyotype 
analyses (Fig. 1C). We found ligated chromo- 
somes in karyotype results of replicated ex- 
periments as well (fig. S1, H to K). All ligated 
chromosomes exhibited complete centromere 
and telomere signals (fig. S2A). We also found 
microsegments excised from targeted chromo- 
somes that lacked either telomere or centro- 
mere signals, except in one Chr4+5 line where 
a microchromosome that possessed complete 
centromere and telomere signals was found, 
indicating a ligation of two microsegments 
(fig. S2A). All microsegments and microchro- 
mosomes disappeared after passage 20. 

Continuous sorting has been shown to be 
required to maintain haploidy and avoid spon- 
taneous diploidization in mammalian haESCs 
(12-16). After sorting, two Chr4+5 haESC lines, 
one Chr1+2 haESC line, and two Chr2+1 haESC 
lines were established. Note that the established 
Chr1+2 haESC line was one with a split Chr1+2,
which indicated that, in comparison to a complete
Chr1+2, the split Chr1+2 might be advantageous
for the maintenance of haploidy
in chromosome-engineered haESCs. Although 
the appearance and marker-gene expression 
of engineered haESCs were normal (fig. S2, B 
and C), DNA content analyses revealed significantly
reduced percentages of 1n cells in Chr2+1
haESCs (fig. S2, D and E). Confocal microscopy 
analysis further revealed the existence of lagging 
chromosomes that could overlap with one 
another in dividing Chr2+1 haESCs that main- 
tained haploidy or spontaneously diploidized 
(fig. S2, F to I). We sorted the cells exhibiting 
4n DNA content in each line for ploidy anal- 
ysis, and those derived from Chr2+1 haESCs 
showed the highest proportion of authentic 
polyploidy (24.5%) (fig. S2J). 

Hi-C analyses revealed strengthened contacts 
between ligated chromosomes in engineered 
haESCs (Fig. 1D and fig. S3, A to C). We also 
found increased contacts between the split 
Chr1+2 and chromosome 17 in Chr1+2 haESCs 
(Fig. 1D). Using this feature, we located the 
splitting site of Chr1+2 at about 114.3 Mb in
chromosome 1, whose proximal arm was fused 
with chromosome 17 (Chr1+17), thus exhibit- 
ing strengthened interchromosomal contacts 
(fig. S3D). Although the biological function of 
interchromosomal contacts in animal cells re- 
mains unknown (27), these data indicate that 
they could be strengthened by chromosome 
ligation. Ligated chromosomes exhibited a
limited compartment switch, except chromosome
1 in Chr1+2 haESCs (fig. S3E), on which
two opposite compartment switch modes divided
by the 114.3-Mb splitting site were identified
(fig. S3F). All chromosome-ligated haESCs
exhibited transcriptomes similar to those of
wild-type (WT) counterparts (fig. S3G).

Fig. 1. Engineered chromosome ligation in mouse haESCs. (A) Diagrams for
ligation of chromosomes 4 and 5 (Chr4+5), chromosomes 1 and 2 (Chr1+2), and
chromosomes 2 and 1 (Chr2+1). (B) FISH detection for ligated chromosomes in Chr4+5
(n = 10), Chr1+2 (n = 6), and Chr2+1 (n = 7) haESCs. The chromosome indicated
by an arrow is further illustrated in fig. S1G. WT haESCs (n = 14) were used as controls.
Scale bars are 5 μm. (C) Standard G-banding karyotyping results for Chr4+5, Chr1+2,
and Chr2+1 haESCs. Ligated chromosomes are indicated by red arrows and text.
(D) Contact maps of Chr4+5 (n = 2), Chr1+2 (n = 2), and Chr2+1 (n = 2) haESCs.
Arrows indicate increased interchromosomal contacts. Numbers on the left indicate the
chromosome; numbers on the right represent contact values.

Fig. 2. Ligation-induced polyploidization in diploid NSCs. (A) Hoechst
staining shows overlapped lagging chromosomes in engineered NSCs (white
arrows). Scale bars are 5 μm. (B) Percentage of cells containing overlapped
lagging chromosomes in Chr4+5, Chr1+2, and Chr2+1 NSC lines. (C) Ploidy of NSCs
carrying ligated chromosomes (n = 3). The sorting of cells with DNA content
equal to 2n or >4n is shown in the middle. Ploidy analyses for 2n cells on
day 20 are shown on the left. All cells were synchronized with colchicine
for 30 hours before ... shown on the right. Arrows indicate ligated chromosomes.
Scale bars are 5 μm. (D) Chromosome numbers of unsorted WT (n = 32), Chr4+5 (n = 22), Chr1+2 (n = 24), and Chr2+1 (n = 29) NSCs. The arrow indicates a
fraction of polyploid Chr2+1 NSCs. Chromosome numbers >4n in Chr2+1 NSCs or >2n in other NSC groups represent overlapped karyotypes. For all graphs, data are
means ± SEM. **p < 0.01; ****p < 0.0001; ns, not significant.

PacBio sequencing analyses were then used
to identify ligations and structural variants
(SVs) in each engineered haESC line (fig. S4,
A to C). Derived from parallel subclones, Chr1+2
haESCs and Chr1+2 haESCs (sc-2; sc, subclone)
exhibited identical ligation and distinct SVs 
(table S4), implying a random occurrence of 
the latter. Discontinuous reads mapping to 
114.3 Mb of chromosome 1 were found in Chr1+2 
haESCs and Chr1+2 haESCs (sc-2) and shared a 
14-base pair (bp) AT-rich endpoint sequence 
that indicated the precise splitting site (fig. 
S4D). Because the 14-bp nucleotides or their 
neighboring sequences did not match the 
sequences of sgRNAs, the split likely resulted 
from random microhomology-mediated end 
joining, which usually leaves 5- to 25-bp AT- 
rich endpoint sequences (22). Average SV sizes 
were 366, 166, and 558 bp in Chr4+5, Chr1+2, 
and Chr2+1 haESCs, respectively (fig. S4E). 
No sgRNA targeting site was found near any 
identified SVs (table S4). Moreover, no corre- 
lation between SVs and neighboring gene 
expression was found in any chromosome- 
engineered haESC line (fig. S4, F and G). 


Mitotic nuclear division arrested by large 
chromosome ligation 


Based on a combination of Hi-C and PacBio 
sequencing, the arm length was 308.3 Mb for 
Chr4+5 (156.5 Mb of chromosome 4 plus 151.8 Mb
of chromosome 5; fig. S3A), 377.6 Mb for Chr2+1
(182.1 Mb of chromosome 2 plus 195.5 Mb of
chromosome 1; fig. S3B), 263.3 Mb for Chr1+2
(81.2 Mb of distal chromosome 1 plus 182.1 Mb 
of chromosome 2; fig. S3C), and 209.6 Mb 
for Chr1+17 (114.3 Mb of proximal chromo- 
some 1 plus 95.3 Mb of chromosome 17; fig. 
S3D). Chr2+1 haESCs exhibited overlapping 
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lagging chromosomes and a high tendency 
toward polyploidization. However, the credi- 
bility of these observations was compromised 
by the spontaneous diploidization tendency 
of haESCs. Following a three-step differenti- 
ation procedure (fig. S5A), we tried to obtain 
diploid neural stem cells (NSCs) from engi- 
neered haESCs. Embryoid bodies and NSCs 
were successively derived and possessed normal 
appearances (fig. S5B). NSCs also exhibited 
normal marker-gene expressions, maintained 
ligated chromosomes, and had a diploid karyo- 
type (fig. S5, B to D). Hi-C results revealed 
strengthened contacts between ligated chro- 
mosomes (fig. S6, A to E). Anaphase lagging 
chromosomes were found in 12.2, 0, and 73.9% 
of NSCs carrying Chr4+5, Chr1+2, and Chr2+1, 
respectively (Fig. 2, A and B). Chr2+1 NSCs ex- 
hibited a high tendency toward polyploidization 
(Fig. 2C). For sorted cells (DNA content =4n), 
only those from Chr2+1 NSCs showed authentic 
tetraploidy (Fig. 2C). For unsorted cells, Chr2+1 
NSCs exhibited a marked polyploidized fraction 
not found in other groups (Fig. 2D). 
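
The arm lengths quoted at the start of this section are simple sums of the ligated segments, and the arithmetic can be checked directly. The short Python sketch below is only a bookkeeping aid: the dictionary and function names are illustrative and not part of the original analysis, and the segment values are the ones quoted in the text (the two Chr4+5 segments must add up to the reported 308.3 Mb).

```python
# Segment lengths in Mb, as quoted in the text (names are illustrative).
SEGMENTS_MB = {
    "Chr4+5": [156.5, 151.8],   # chromosome 4 + chromosome 5
    "Chr2+1": [182.1, 195.5],   # chromosome 2 + full chromosome 1
    "Chr1+2": [81.2, 182.1],    # distal chromosome 1 + chromosome 2
    "Chr1+17": [114.3, 95.3],   # proximal chromosome 1 + chromosome 17
}


def arm_length_mb(name):
    """Return the ligated arm length in Mb, rounded to one decimal."""
    return round(sum(SEGMENTS_MB[name]), 1)


# The split of chromosome 1 at 114.3 Mb should recover its full length.
chr1_total_mb = round(81.2 + 114.3, 1)

if __name__ == "__main__":
    # List the ligated arms from longest to shortest.
    for name in sorted(SEGMENTS_MB, key=arm_length_mb, reverse=True):
        print(f"{name}: {arm_length_mb(name)} Mb")
    print(f"chromosome 1 (distal + proximal): {chr1_total_mb} Mb")
```

Sorting by length places Chr2+1 first, which is the only ligation associated with polyploidization in the NSC data above.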

Next, we designed a programmed translocation
between Chr1+2 and Chr1+17 in Chr1+2
haESCs, aiming to recover full-length Chr1+2
(recovered Chr1+2; Fig. 3A). In parallel, we tried
to shorten Chr2+1 to the size of Chr1+2 using a
programmed translocation with chromosome 
17 (truncated Chr2+1 and Chr2+17; Fig. 3A). 
sgRNAs were designed according to the splitting 
site (fig. S4D and table S1). As a result, 0.10 to 
0.42% of clones were PCR-positive (n = 4; 
table S3). Sanger sequencing and karyotype 


analysis confirmed the formation of recovered 
Chr1+2 (recovered Chr1+2 haESCs) and trun- 
cated Chr2+1 and Chr2+17 (truncated Chr2+1 
haESCs; fig. S7, A to D). 

Spontaneously diploidized ESCs in each 
haESC line were sorted and labeled with a 
genome-integrating H2B-RFP-PiggyBac plas- 
mid to show chromosome behavior. Normal 
nuclear division was observed in 82 WT, 131 
Chr4+5, and 56 Chr1+2 ESCs. By contrast, 9 of 
112 Chr2+1 ESCs polyploidized during the 
imaged cycle. We found a continuous exis- 
tence of overlapped lagging chromosomes in 
all Chr2+1 ESCs that eventually polyploidized, 
accompanied by arrested anaphase and re- 
fusion of daughter nuclei (Fig. 3B). Although 
lagging chromosomes were also observed in 
Chr1+2 and Chr4+5 ESCs at certain time
points, they detached from one another with 
spindle elongation (Fig. 3B). In ESCs that car- 
ried recovered Chr1+2, a continuous existence 
of overlapped lagging chromosomes and refu- 
sion of daughter nuclei were found in 8 of 67 
observed cells (Fig. 3B). None of 87 ESCs that 
carried truncated Chr2+1 polyploidized during 
the imaged cycle (Fig. 3B). These results show 
that the arm of ligated chromosome Chr2+1 or 
recovered Chr1+2 (both 377.6 Mb in size) was 
spatially incompatible for diploid mouse cells, 
leading to arrested nuclear division and cell 
polyploidization; shortening the arm size by 
114.3 Mb could eliminate this effect. 
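
As an illustration of how the imaging counts above could be compared, the following Python sketch computes a one-sided Fisher's exact test from the hypergeometric distribution. This test is not reported in the text; it is an assumption made here for illustration, and the function name is hypothetical. The counts (9 of 112 Chr2+1 ESCs, 8 of 67 recovered Chr1+2 ESCs, and 0 of 87 truncated Chr2+1 ESCs polyploidized) are taken from the paragraph above.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability of observing at least `a` events in the first
    group when the row and column totals are held fixed.
    """
    row1 = a + b            # size of the first group
    events = a + c          # total number of events in both groups
    n = a + b + c + d       # total number of cells
    upper = min(row1, events)
    denominator = comb(n, events)
    tail = sum(
        comb(row1, k) * comb(n - row1, events - k)
        for k in range(a, upper + 1)
    )
    return tail / denominator


# Chr2+1 ESCs (9 polyploidized, 103 not) vs. truncated Chr2+1 ESCs (0 of 87).
p_chr2_1 = fisher_one_sided(9, 112 - 9, 0, 87)

# Recovered Chr1+2 ESCs (8 of 67) vs. truncated Chr2+1 ESCs (0 of 87).
p_recovered = fisher_one_sided(8, 67 - 8, 0, 87)

if __name__ == "__main__":
    print(f"Chr2+1 vs. truncated Chr2+1: p = {p_chr2_1:.4f}")
    print(f"recovered Chr1+2 vs. truncated Chr2+1: p = {p_recovered:.4f}")
```

Under these assumptions both comparisons give small p-values, consistent with the interpretation that truncating the arm removes the polyploidization effect.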

Refusion of daughter nuclei was slower than 
normal nuclear division (Fig. 3B). Consistent with 
this observation, we found a lower proliferation 
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with chromosome 17 and 
Chr1+17 formed in the meantime in Chr1+2 haESCs, (IV) recovered Chr1+2 and Chr1+17 in
translocated haESCs, (V) Chr2+1 and chromosome 17 in Chr2+1 haESCs, and (VI) truncated
Chr2+1 and Chr2+17 forms in the meantime in another translocated haESC.
(B) Continuous nuclear imaging of diploid ESCs: (WT) diploid ESCs with a wild-type karyotype,
(I) diploidized ESCs sorted from Chr4+5 haESCs, (III) diploidized ESCs sorted from Chr1+2
haESCs, (IV) diploidized ESCs sorted from recovered Chr1+2 haESCs, (V) diploidized ESCs sorted
from Chr2+1 haESCs, and (VI) diploidized ESCs sorted from truncated Chr2+1 haESCs. The time
point when all chromosomes are arranged in the equatorial plate (metaphase) is set as 0 min.


rate in Chr2+1 haESCs (fig. S7E). In truncated 
Chr2+1 haESCs with normal nuclear division 
(Fig. 3B), the proliferation rate was recovered 
(fig. S7E). Sub-G1 phase proportion analysis
indicated that the cell death ratio did not
change after the ligations (fig. S7F). Fitted
Gaussian curves showed an extended S/G2 phase
in Chr2+1 haESCs (fig. S7G).


Production of viable pups that carry 
ligated chromosomes 


To establish a proper imprinting pattern (20), 
we deleted three imprinted regions (H19, IG, 
and Rasgrf1) in engineered haESCs (table S3).
Through oocyte injection of derived haESCs, 
we generated 113 Chr4+5 embryos, 355 Chr1+2 
embryos, and 365 Chr2+1 embryos, which were 
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transferred into surrogate wombs (Fig. 4A). No 
full-term pup was derived from Chr2+1 em- 
bryos, which were abnormal and died before 
embryonic day 12.5 (E12.5; Fig. 4B and table 
S5). By contrast, haESC-injected embryos with
a WT karyotype achieved full term efficiently 
(fig. S8A and table S5). Chr2+1 embryos showed 
a significantly increased percentage of polyploid 
cells (fig. S8, B to D). 

Fourteen and 37 full-term pups were derived 
from Chr4+5 and Chr1+2 embryos (Chr4+5 and
Chr1+2 pups), respectively (Fig. 4B and table
S5). Chromosome ligations in these pups were
confirmed (Fig. 4C and fig. S8A). Body weights
of Chr4+5 (1.37 ± 0.09 g, n = 13) and Chr1+2
pups (1.35 ± 0.15 g, n = 12) were normal (Fig. 4D)
and so were their placenta weights (fig. S8E).

White arrows indicate overlapped lagging chromosomes; yellow arrows indicate lagging chromosomes detached by spindle elongation. Scale bars are 5 μm.


By transferring 68 embryos created by oocyte 
injection of truncated Chr2+1 haESCs, two 
full-term living pups were also derived (fig. S8F 
and table S5). 

Ligation-triggered polyploidization was con- 
firmed in haESCs that were diploidized in vitro 
(Fig. 2) but not in bona fide 2n ESCs derived 
from mouse embryos. We therefore obtained 
ESCs from E3.5 Chr4+5, Chr1+2, and Chr2+1
embryos. Four out of 60 ESCs that were de- 
rived from Chr2+1 embryos exhibited contin- 
uously overlapping chromosomes during the 
imaged cell cycle, followed by arrested ana- 
phase and nuclear refusion (fig. S8G). All ESCs 
that were derived from Chr4+5 (n = 92), Chr1+2 
(n = 89), and truncated Chr2+1 (n = 87) embryos
exhibited normal nuclear divisions (fig. S8G). 
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Fig. 4. Production of mice that carry ligated chromosomes. (A) Strategy for 
generating mouse embryos that carry ligated chromosomes. 3KO, deletions 
of three imprinted regions; MII, metaphase II. (B) Full-term Chr4+5 and 
Chr1+2 pups and arrested E12.5 Chr2+1 embryos. Photographs are shown in 
the top row, and images for green fluorescent protein-positive signals are
shown in the bottom row. The embryo with a heartbeat is labeled with an
asterisk. Scale bars are 5 mm. (C) Standard G-banding karyotype results. Red
arrows indicate ligated chromosomes. CN, chromosome number. (D) Body
weights of WT (n = 12), Chr4+5 (n = 13), and Chr1+2 (n = 12) full-term
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pups. (E) Adult Chr1+2 (bottom) and Chr4+5 (top) mice. The black 

mouse is the WT control. (F) Growth curves of WT (n = 11), Chr4+5 (n = 7),
and Chr1+2 (n = 9) mice. (G) Activity traces of 8-week-old WT, Chr4+5,
and Chr1+2 mice in the open-field test. The time periods along the top
indicate 60, 120, 300, and 600 s after the start of test. (H) Entries of WT,
Chr4+5, and Chr1+2 mice into the anxiety-provoking central area in the
open-field test. (I) Velocities of WT, Chr4+5, and Chr1+2 mice in the
open-field test. For all graphs, data are means ± SEM. *p < 0.05; **p < 0.01;
***p < 0.001; ns, not significant.
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Fig. 5. Deriving homozygous offspring that carry ligated chromosomes. 
ve generations. Square, ma 
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gray, WT mice. (B) Comparison of the litter sizes 
and WT mice (n = 7). (C) Ratios of Chr4+5-/-, C
offspring by mating Chr4+5+/- mice (n = 18). (D


Chr4+5 and Chr1+2 pups grew to adulthood 
(Chr4+5 and Chr1+2 mice; Fig. 4E). Chr4+5
mice had normal growth curves, but Chr1+2 
mice exhibited overgrowth at weaning (Fig. 
4F). In the open-field test for anxiety, Chr4+5 
mice entered the center zone at a normal 
rate, but Chr1+2 mice tended to avoid enter- 
ing the center zone (Fig. 4, G and H), indicat- 
ing a high level of anxiety (23). The moving 
distance and velocity of Chr4+5 mice were 
normal, whereas Chr1+2 mice moved signif- 
icantly less and slower (Fig. 4, H and I, and 
fig. S8, H and I). 
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Phenotype-associated gene dysregulation in 
chromosome-ligated mice 

We then compared the transcriptomes of the 
brain, lung, heart, liver, spleen, kidney, and 
muscle in Chr1+2 mice with those of the WT 
mice (n = 2) and identified 2137 organ-specific
and 50 shared differentially expressed genes 
(DEGs), 26.4% of which were located on chro- 
mosomes 1, 2, and 17 (fig. S9, A and B). Chro- 
mosome 17, accounting for 3.5% of the genome, 
had 7.7% organ-specific DEGs and 20% shared 
DEGs, suggesting a possible correlation between 




its rearrangement and gene dysregulation. We 


e Chr4+5-containing axis is indicated by the arrow), 
merged images with Hoechst-stained DNA, and schemes for trivalent Chr4+5- 
Chr4-Chr5 and bivalent WT counterparts. Scale bars are 5 μm. (G) Three
hypothetical segregation patterns of trivalent Chr4+5-Chr4-Chr5 in Chr4+5+/-
spermatocytes. (H) Comparing sperm mobilities in WT (n = 4) and C 
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results for ICSI embryos. Example G-banding karyotypes 


of ICSI embryos that fit hypothetical segregation patterns I and II are shown on
the left and middle. Karyotype distributions of 21 counted ICSI embryos 
on the right. The red arrow indicates a redundant chromosome 5. For a 
. **p < 0.001; ****p < 0.0001; ns, not significant.

analyzed the gene Capn11, which encodes a
calcium-dependent protease and is located
on chromosome 17. It was down-regulated
in all organs (fig. S9C). Capn11 was also down-
regulated in Shank3-overexpressing mice,
a well-characterized model for autism and
schizophrenia (24). We therefore tested whether
Capn11 was related to the abnormal behavior of
Chr1+2 mice. We deleted Capn11 in WT C57
mice (Capn11-KO C57 mice; fig. S10, A to C).
The mice avoided the center zone in the open-
field test and exhibited increased body weight
at weaning (fig. S10, D to F). These data
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Fig. 6. Chromatin structure disturbances are weakened by differentiation. 
(A) Interaction values from chromosome 5 to chromosome 4 in Chr4+5 haESCs, 
NSCs, and brain. Results for WT haESCs, NSCs, and brain were used as controls. 
Arrows indicate boundaries for increased interactions. (B) Frequency of TAD 
alterations between WT and Chr4+5 haESCs, NSCs, and brain. (C) Metaplots
for TAD boundaries in Chr4+5 haESCs [mean insulation score (IS) = 2266.48], 


NSCs (mean IS = 2255.09), and brain (mean IS = 2254.16). Results for WT
haESCs (mean IS = 2061.29), NSCs (mean IS = 2345.57), and brain (mean IS =
2224.56) were used as controls. (D) Combined analyses illustrating altered gene
expression and changed chromatin contacts in WT and Chr4+5 haESCs, NSCs,
and brain. Genes with an expression fold change (FC) >2 are in red. TSS,
transcription start site.


suggested that Capn11 dysregulation contributed
to the behavior phenotype of Chr1+2 mice.

Reduced Capn11 levels were also found in
Chr1+2 haESCs and NSCs (fig. S11A). However,
Hi-C analysis revealed no contact change
at the Capn11 locus in these cells (fig. S11B).
PacBio sequencing results revealed an intact
Capn11 locus in Chr1+2 haESCs (fig. S11C).
ATAC sequencing (assay for transposase-
accessible chromatin using sequencing) results
also showed no peak pattern change in Chr1+2
haESCs and NSCs (fig. S11D), and virtual 4C
analyses revealed no contact pattern change at
this locus (fig. S11E). Inferred three-dimensional
structure results showed that the normal folding
pattern of chromosome 17 was maintained
in Chr1+2 haESCs and NSCs (fig. S11F). A long
terminal repeat (LTR) is located within the
Capn11 locus (25), which lacked contact with
other genome regions (fig. S11B). Deleting
this LTR did not rescue the Capn11 levels (fig. S11,
G and H, and table S3). Capn11 levels were also
down-regulated in truncated Chr2+1 haESCs
but were restored in Chr2+1 haESCs and even
in recovered Chr1+2 haESCs (fig. S11H). Because
chromosome 17 was fused with distinct segments
in Chr1+2 and truncated Chr2+1 haESCs,
these data suggested the existence of a sequence-
independent ligation-related effect on Capn11
dysregulation (26).


Deriving homozygous chromosome-ligated 
mouse offspring 

No pup was derived from mating more than 
30 Chr1+2 mice. By contrast, Chr4+5 mice (F0)
produced full-term mice (F1) after mating with
WT mice (Fig. 5A). After multiplying the number
of viable embryos (those devoid of H19 and
IG deletions) in each litter by four, following
the Mendelian ratio, the corrected litter size of
F0 was 2.3 ± 3.1 (n = 7), which was significantly
lower than that of WT counterparts (5.9 ± 1.6,
n = 7; fig. S12A). Of these derived F1 mice,
those carrying ligated chromosomes were 
identified and confirmed by PCR genotyping, 
Sanger sequencing, and karyotype analyses 


26 AUGUST 2022 • VOL 377 ISSUE 6609 973




(fig. S12, B to E). Both female and male F1
mice could transmit Chr4+5 by mating with
the WT mice (fig. S12, F and G). We mated
female and male heterozygous F1 mice devoid
of all imprinting deletions (Chr4+5+/- mice),
which also exhibited a reduced litter size (2.9 ±
1.8, n = 18; Fig. 5, A and B). Three homozygous
mice that carried Chr4+5 (Chr4+5+/+ mice)
were derived. They had 19 chromosome pairs
(Fig. 5, C to E).

We also derived ESCs from E3.5 Chr4+5+/+
mouse embryos (Chr4+5+/+ ESCs). FISH detection
showed that chromosomes 4 and 5 were
merged in Chr4+5+/+ ESCs (fig. S13A). By analyzing
the organs of Chr4+5+/+ mice, we found
merged chromosomes 4 and 5 in all samples
by Southern blotting (fig. S13B). Hi-C sequencing
results for Chr4+5+/+ mouse brains (Chr4+5
brains) revealed an increase in interchromosomal
contacts between chromosomes 4 and
5 (fig. S13C), the levels of which were similar
to those of intrachromosomal contacts within
native chromosomes (fig. S13D). Together,
these data indicated that Chr4+5 was homogeneously
retained in the cells and organs of
Chr4+5+/+ mice.

A total of 3 Chr4+5+/+ pups, 33 Chr4+5+/-
pups, and 17 WT pups were obtained from 18
litters of Chr4+5+/- offspring (Fig. 5C). The
percentage of Chr4+5+/+ pups (5.7%) was much
lower than that of Chr4+5-/- pups (32.1%) or
one half of Chr4+5+/- pups (62.2%), which did
not fit Mendel’s law for mating heterozygous
parents (+/+:+/-:-/- = 1:2:1). Because Chr4+5
shared homologous sequences with both chromosomes
4 and 5, potential explanations for
the mating results could involve errors in
pairing or segregation of chromosomes in
Chr4+5+/- germ cells. Using SYCP3 staining
to show chromosome axes and FISH detection
to indicate chromosomes 4 and 5, we
identified separated bivalents of chromosomes 4
and 5 in WT pachytene spermatocytes (n = 18)
and trivalents that exhibited both signals of
chromosomes 4 and 5 in Chr4+5+/- pachytene
spermatocytes (n = 30; Fig. 5F), indicating that
Chr4+5 can correctly synapse with chromosomes
4 and 5. We therefore propose three
hypothetical segregation patterns for paired
Chr4+5, Chr4, and Chr5 (Fig. 5G). To dissect
the actual segregation outcome, we derived mature
sperm from Chr4+5+/- mice (Chr4+5+/-
sperm; movie S1). The percentage of Chr4+5+/-
sperm that exhibited high mobility (linear
motion) was 9.7 ± 5.2% (n = 7), which was
lower than that of the WT counterpart (41.1 ±
5.8%, n = 4; Fig. 5H). To avoid bias for highly
motile sperm, we used intracytoplasmic sperm
injection (ICSI) to generate embryos (ICSI
embryos) whose karyotypes could be used to
deduce the chromosomal content of the injected
sperm. We found embryos with karyotypes consistent
with hypothetical segregation pattern I or
II but not with segregation pattern III (Fig. 5I).
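
The departure of the 53 genotyped pups from the expected 1:2:1 ratio can be checked with a chi-square goodness-of-fit test. The Python sketch below is an illustrative calculation, not an analysis reported in the text; the function and variable names are hypothetical. It relies on the fact that, for two degrees of freedom, the chi-square survival function reduces exactly to exp(-x/2), so no statistics library is needed.

```python
from math import exp

# Pup counts reported in the text for matings of heterozygous parents.
observed = {"+/+": 3, "+/-": 33, "-/-": 17}
# Mendelian expectation for a heterozygous cross.
mendelian_ratio = {"+/+": 1, "+/-": 2, "-/-": 1}


def chi_square_gof(observed, ratio):
    """Chi-square goodness-of-fit statistic against an expected ratio."""
    total = sum(observed.values())
    ratio_sum = sum(ratio.values())
    statistic = 0.0
    for genotype, count in observed.items():
        expected = total * ratio[genotype] / ratio_sum
        statistic += (count - expected) ** 2 / expected
    return statistic


statistic = chi_square_gof(observed, mendelian_ratio)
# Three genotype classes give 2 degrees of freedom, for which the
# chi-square survival function is exactly exp(-x / 2).
p_value = exp(-statistic / 2)

if __name__ == "__main__":
    total = sum(observed.values())
    for genotype, count in observed.items():
        print(f"{genotype}: {count} pups ({100 * count / total:.1f}%)")
    print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

With these counts the statistic is roughly 10.6 and the p-value roughly 0.005, driven mainly by the shortfall of homozygous Chr4+5 pups.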
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Chromatin structure changes after engineered 
chromosome ligation 

We found an increased number of contacts 
between ligated chromosomes in Chr4+5 
haESCs, NSCs, and brain (fig. S8A, S6B, and 
S13D). In Chr4+5 haESCs, increased contacts
of chromosome 5 to chromosome 4 were clus- 
tered on proximal chromosome 5 (0 to 40 Mb; 
Fig. 6A), and increased contacts of chromo- 
some 4 to chromosome 5 were clustered on 
distal chromosome 4 (140 to 160 Mb; fig. S14A).
A similar but minor contact change was found 
in Chr4+5 NSCs, whereas the minimal change 
was found in Chr4+5 brain (Fig. 6A and fig. 
S14A). These findings were confirmed by bio- 
logical replicates that exhibited correlation 
values ranging from 0.961 to 0.979 (fig. S15A).
Topologically associating domain (TAD) scores
indicated enhanced TAD compactness in Chr4+5 
NSCs and brain (fig. S15B). The percentages 
of changed TADs were 35.8, 31.3, and 21.0% in 
Chr4+5 haESCs, NSCs, and brain, respectively, 
exhibiting no prominent distribution on chro- 
mosomes 4 and 5 (Fig. 6B and fig. S15C). For 
those on chromosome 4, changed TADs were 
randomly scattered across the entire arm (fig. 
S15D). Meta-TAD borders were strengthened 
in Chr4+5 haESCs, weakened in Chr4+5 NSCs, 
and maintained in Chr4+5 brain (Fig. 6C). We 
identified 58, 1595, and 418 DEGs in Chr4+5 
haESCs, NSCs, and brain, respectively, which 
were not clustered on chromosomes 4 and 5 
(fig. S16, A and B). Changed contacts were not 
clustered at the transcriptional start sites of 
DEGs (Fig. 6D and fig. S16C). By analyzing 
genes that were paired by the same changed 
contacts, we found no correlation between 
their expression fold changes (fig. S16D). DEGs 
within changed contacts were enriched in 
pathways for exocytosis in haESCs, urogenital 
system development in NSCs, and axonogenesis 
in brain (fig. S16E).


Discussion 


In this study, we created laboratory mouse 
models that carried chromosome-level fusions
by engineering. Some engineered mice
showed abnormal behavior and postnatal 
overgrowth, whereas others exhibited decreased 
fecundity, suggesting that although the change 
of genetic information was limited, fusion of 
animal chromosomes could have profound 
phenotypic effects. Capn11, which is located
on a rearranged chromosome, might have contributed
to the phenotypes. Capn11 dysregulation
could have arisen by means of a sequence- 
independent effect associated with the re- 
arrangements (26). Changes in TADs and 
interchromosomal contacts in chromosome- 
ligated mice were similar to findings with 
natural Rb mice (27), suggesting that our work 
could help our understanding of evolutionar- 
ily derived chromosome fusions. Ligating the 
two largest mouse chromosome arms led to 


refusion of daughter nuclei, consequently to 
cell polyploidization, and finally to embryonic 
lethality, but all of these effects were elimi- 
nated when the ligated arms were truncated 
by two independent translocations. This evi- 
dence suggests that the physical space of the 
mitotic nucleus is a potential constraining factor 
in mammalian karyotype evolution (28, 29). 

Reproductive isolation and formation of new 
species may arise through accumulating chro- 
mosomal rearrangements that reduce fertility 
in heterozygous hybrids (30, 31). Chr1+2 mice
(carrying two rearrangements) did not produce 
offspring, but Chr4+5 mice (carrying one re- 
arrangement) did, although with limited fecun- 
dity. By analyzing the spermatocytes of Chr4+5 
mice, we pinpointed the reproduction barrier 
to a segregation error of ligated chromosomes;
the reduced fecundity could therefore be attributed
to impaired mobility of aneuploid sperm or
the developmental failure of aneuploid embryos.
With a lower birth rate, homozygous
Chr4+5 mice were derived by mating heter- 
ozygous parents, suggesting that one fusion 
was insufficient for reproductive isolation in 
mice. Using an imprint-fixed haESC plat- 
form and gene editing, we achieved germline- 
transmittable chromosome ligation in a widely 
used animal model, the house mouse, which 
highlights a potential route for large-scale 
engineering of endogenous or exotic DNA in 
mammals (32). 
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Growth rules for irregular architected materials with 


programmable properties 


Ke Liu, Rachel Sun, Chiara Daraio*


Biomaterials display microstructures that are geometrically irregular and functionally efficient. Understanding 
the role of irregularity in determining material properties offers a new path to engineer materials with superior 
functionalities, such as imperfection insensitivity, enhanced impact absorption, and stress redirection. We 
uncover fundamental, probabilistic structure-property relationships using a growth-inspired program that
evokes the formation of stochastic architectures in natural systems. This virtual growth program imposes a set 
of local rules on a limited number of basic elements. It generates materials that exhibit a large variation in 
functional properties starting from very limited initial resources, which echoes the diversity of biological 
systems. We identify basic rules to control mechanical properties by independently varying the 
microstructure’s topology and geometry in a general, graph-based representation of irregular materials. 


The properties of materials depend both
on their chemical composition and on
the geometry of their microstructures.
Empowered by carefully engineered subscale
microstructures, architected materials
(1-5) have been suggested for applications
in optics (6), electromagnetics (7, 8), acoustics
(9), and robotics (10-12). In mechanics (13),
architected materials have been designed to
exhibit negative thermal expansion (14), negative
Poisson’s ratio (15), ultrahigh strength-to-weight
ratio (16, 17), tunable failure load (18),
vanishing shear modulus (19), and shear-normal 
coupling (20). To reduce the complexity of 
designing structures in a nearly infinite space, 
human-made architected materials are mostly 
designed by periodic tessellations of selected 
geometric motifs. These motifs are either de- 
rived empirically from a limited number of 
known geometries, such as biomaterials, crys- 
talline solids, and art (15, 16, 21), or com- 
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putationally generated within bounding boxes 
discretized into pixels or voxels (22-25). 
Materials with periodic microstructures are 
special cases in the realm of architected 
materials. Natural materials are usually char- 
acterized by irregular and heterogeneous 
microstructures, such as wood (26), nacre 
(27), insect nests (28) (Fig. 1A), or human 
bones (29). They present distinctive proper- 
ties, such as the exceptionally white scales 
of some beetles (30) or the functional stab- 
ility to perturbations of proteins (31). The 
geometric irregularity of biomaterials is a 
natural outcome of self-organized growth, 
which unfolds through a distributed, stoch- 
astic building process that follows simple 
local rules without a centralized plan (28). 
Understanding the independent role of 
geometry and topology in irregular micro- 
structures provides opportunities for the de- 
sign and fabrication of advanced engineering 
materials. However, current descriptions 
of geometry used for periodic systems lead 
to ambiguity in distinguishing the contribu- 
tion of specific structural features, or their 
repetition, on given functionalities. This 
underlines the importance of developing 


tools to define spatial characteristics in ir- 
regular materials. 

Recently, computational methods have been 
developed to design and characterize irregular 
microstructures (32-36). For instance, the de- 
sign of random, auxetic truss lattices revealed 
important connections between Poisson’s 
ratio and lattice connectivity (33, 34). How- 
ever, these tools do not provide a general frame- 
work to describe the geometry of architected 
materials, for example, because they do not 
include periodic designs in their descriptors. 


A virtual growth program for microstructure 
generation 


To better understand the structure-property 
relationships in irregular architected materials, 
we created a tool that evokes the distributed 
stochastic building process of natural growth, 
which we call the virtual growth program. The 
program is a graph-based method that builds 
on the combinatorial space of basic building 
blocks (Fig. 1B). These building blocks are local 
structural elements that can be identified in 
arbitrarily complex microstructures at a scale 
that is smaller than the typical unit cells in 
periodic designs. In the virtual growth process, 
the building blocks are connected stochasti- 
cally on an underlying network, in which each 
pair of neighbors abides by prescribed adjacency
rules (Fig. 1, C and D). In this framework, a 
material’s microstructure can be both periodic 
and nonperiodic. The framework also decouples 
topology (the connectivity of the underlying 
network) from the geometry (the shape of the 
building blocks) and allows investigating their in- 
dependent influence on global material properties. 
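
To make the idea of stochastic assembly under adjacency rules concrete, the Python sketch below fills a square grid with lattice-like building blocks. It is a minimal illustration under stated assumptions and not the authors' virtual growth program: the raster-scan filling order, the tile names, the connector encoding, and the `weights` argument (standing in for frequency hints) are all hypothetical choices made here. Two rules are enforced between neighbors: connectors must meet connectors, and two "L" blocks may not be joined to each other.

```python
import random

# Each building block lists which sides carry a connector, in the order
# (north, east, south, west). Names and encoding are illustrative only.
TILES = {
    "-0": (0, 1, 0, 1), "-1": (1, 0, 1, 0),
    "L0": (1, 1, 0, 0), "L1": (0, 1, 1, 0),
    "L2": (0, 0, 1, 1), "L3": (1, 0, 0, 1),
    "T0": (1, 1, 1, 0), "T1": (0, 1, 1, 1),
    "T2": (1, 0, 1, 1), "T3": (1, 1, 0, 1),
    "+0": (1, 1, 1, 1),
}
N, E, S, W = range(4)


def compatible(tile, neighbor, side):
    """Adjacency rule between `tile` and the `neighbor` on its `side` (N or W)."""
    opposite = S if side == N else E
    if TILES[tile][side] != TILES[neighbor][opposite]:
        return False  # a connector must meet a connector, a blank a blank
    both_l = tile[0] == "L" and neighbor[0] == "L"
    # Two L-shaped blocks may touch, but may not be joined by a connector.
    return not (both_l and TILES[tile][side] == 1)


def grow(rows, cols, weights, seed=None):
    """Fill a rows x cols grid in raster order with compatible random tiles.

    `weights` maps a block type ("-", "L", "T", "+") to a positive weight.
    Connectors on the outer boundary are left dangling in this sketch.
    """
    rng = random.Random(seed)
    grid = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            options = [
                t for t in TILES
                if (r == 0 or compatible(t, grid[r - 1][c], N))
                and (c == 0 or compatible(t, grid[r][c - 1], W))
            ]
            tile_weights = [weights[t[0]] for t in options]
            grid[r][c] = rng.choices(options, weights=tile_weights)[0]
    return grid


if __name__ == "__main__":
    sample = grow(4, 6, {"-": 1, "L": 1, "T": 1, "+": 1}, seed=0)
    for row in sample:
        print(" ".join(row))
```

With this particular tile set a compatible block always exists for any pair of upper and left neighbors, so the raster scan never needs to backtrack; a more general implementation would have to handle dead ends and limited block availability.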

In this work, we use the virtual growth pro- 
cess to unravel structure-property relationships 
in irregular architected materials. We show that 
by starting from a very limited number of local 
structures (i.e., the building blocks), it is possible 
to generate a rich set of material microstruc- 
tures with a wide range of functional properties. 
Specific properties can be targeted, by selecting 
adjacency rules and building blocks availability
during “growth.” These findings provide insight
into how to program material properties in
stochastic, self-assembly processes, and may
influence future manufacturing of engineering
materials.

Fig. 1. Schematic of the virtual growth process of irregular architected materials. (A) Termite nests have irregular internal structures that are optimized for
structural stability and ventilation (28). (B) Abstraction of the “growth” process, which assigns building blocks on an underlying graph. (C and D) Illustration of the
virtual growth process (C) in 2D (movie S1) and (D) in 3D (movie S2). The physical models in (C) and (D) are 3D printed.

The virtual growth program relies on four
major inputs, which serve as the genome for
the generation of architected materials: (i) the
topology of the underlying network, (ii) the
geometry of building blocks, (iii) the adjacency
rules between building blocks, and (iv) the
availability of (or frequency hint for) building
blocks. The program can create materials with
different microstructures (Fig. 2). For example,
the same square network (Fig. 2A) can be used
to accommodate different building blocks (Fig. 2,
B to D), including their reflections and rotations
(fig. S1A). The adjacency rules define
whether and how the basic building blocks
can pair with each other (fig. S1B) by enforcing


geometrical compatibility at the interface and 
avoiding unwanted geometric features. For 
example, in the case of Fig. 2B, we forbid two 
“L”-shaped building blocks from connecting to
avoid forming disconnected loops. The avail- 
ability of building blocks resembles natural 
resource limits and influences how many times 
each building block appears in the final design 
(fig. S1C). Infinite availability of building blocks 
leads to constant frequency hints throughout 
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Fig. 2. Irregular materials generated by the 
virtual growth program. (A) Typical output 
of the virtual growth program, which is a 
symbolic graph. The letters and numbers 

are indexes that refer to the basic building blocks 
and their orientations. (B) Lattice-like design, 
which is the focus of this article. The “—,” “T,”
“L,” and “+” symbols represent the building
blocks in the box. (C) Spinodal pattern-like 
design. (D) Multimaterial composite. We note 
that the building blocks are not limited to 
square shapes as long as the interfaces 
between building blocks are compatible. 
Fig. 3. Mechanical properties of the 2D irregular architected materials. (A) Numerically evaluated
Young's modulus (E_avg/E_s) and average Poisson's ratio (ν_avg) values for different sizes of material
samples as a function of the dimension of the underlying networks. The first three groups are evaluated by
homogenization (SVE), and each point with error bars contains 100 samples. The (last) reference group
contains 10 samples and is evaluated by a direct simulation, with boundary condition as shown in the
inset of (D). The error bars extend to the minimal and maximal values. Num, numerical; Ref, reference.
(B) Plot of E_avg/E_s versus ν_avg for 11 sample groups generated by using different frequency hints,
each containing 100 numerical samples. The insets use pie plots to show the resultant probabilities of
appearance of the basic building blocks. Experiments are performed for seven groups, each with five
samples. The error bars extend to one standard deviation. The arrows indicate trends of property
changes. Exp, experimental. (C) Smoothed distributions of ν_avg and E_avg/E_s, based on the numerical
samples. The color code follows (B). P, probability density function. (D to G) Representative designs
and their experimental stress (−σ)-strain (−ε) curves under compression along both x and y directions
(movie S3). The stresses (−σ) are calculated as effective stress for the bulk volume, in units of
megapascals. The stress and strains are effective values with respect to the bulk dimension of
architected materials. The colors of the designs refer to the different sample groups. The inset shows
the boundary conditions. The thin black lines show our definition of Young's modulus as a secant
modulus between 0.005 and 0.015 strain.


the “growth” process. Defects are likely to hap- 
pen when the availability of a certain building 
block is very low (fig. S1, D and E). To avoid 
defects, in the rest of this study, we assume that 
there is an infinite amount of building blocks 
available for each “growth” process. 

The virtual growth process (movie S1) imple- 
ments a WaveFunctionCollapse algorithm (37). 
In each step, the algorithm assigns a random 
building block to the node on a predefined 
network with minimal nodal entropy. Here, 
nodal entropy is related to the number of build- 
ing blocks that can be assigned to a given node. 
For example, if only one building block can be 
assigned to a given node to satisfy adjacency 


rules, then its nodal entropy is zero. If a node 
can be filled with any building block, its nodal 
entropy is maximal. When the algorithm can- 
not assign any building block to a node, a 
defect forms. This process continues until all 
nodes are assigned, and the nodal entropies 
are updated after each step. 
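
The growth loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the four tile shapes, the encoding of adjacency rules as matching edge connections, the unconstrained boundary, and the names `grow`, `TILES`, and `NEIGHBORS` are all hypothetical. Nodal entropy is taken here as the logarithm of the number of admissible blocks, so a node with a single admissible block has zero entropy, and frequency hints act as sampling weights.

```python
import math
import random

# Edge order: (north, east, south, west); True means a beam reaches that edge.
# Tile shapes and names are illustrative assumptions, not the paper's exact set.
BASE_TILES = {
    "-": (False, True, False, True),
    "L": (True, True, False, False),
    "T": (False, True, True, True),
    "+": (True, True, True, True),
}

def rotations(edges):
    """Return the distinct 90-degree rotations of an edge tuple."""
    out = []
    for _ in range(4):
        if edges not in out:
            out.append(edges)
        edges = (edges[3],) + edges[:3]
    return out

# Every (name, orientation) pair is a candidate building block.
TILES = [(name, rot) for name, e in BASE_TILES.items() for rot in rotations(e)]

# Neighbor offsets with (own edge index, facing edge index of the neighbor).
NEIGHBORS = [(-1, 0, 0, 2), (0, 1, 1, 3), (1, 0, 2, 0), (0, -1, 3, 1)]

def grow(rows, cols, hints, seed=None):
    """Assign one block per node; a value of None marks a defect."""
    rng = random.Random(seed)
    options = {(r, c): list(TILES) for r in range(rows) for c in range(cols)}
    result = {}

    def entropy(node):
        n = len(options[node])
        # Nodes with no admissible block sort first and become defects.
        return math.log(n) if n else -1.0

    while options:
        lowest = min(entropy(n) for n in options)
        node = rng.choice([n for n in options if entropy(n) == lowest])
        candidates = options.pop(node)
        if not candidates:
            result[node] = None
            continue
        # Frequency hints (assumed positive) weight the random choice.
        weights = [hints[name] for name, _ in candidates]
        tile = rng.choices(candidates, weights=weights, k=1)[0]
        result[node] = tile
        # Update unassigned neighbors: keep only blocks whose facing edge matches.
        r, c = node
        for dr, dc, mine, theirs in NEIGHBORS:
            nb = (r + dr, c + dc)
            if nb in options:
                options[nb] = [t for t in options[nb] if t[1][theirs] == tile[1][mine]]
    return result
```

Calling `grow(20, 20, {"-": 1, "L": 1, "T": 1, "+": 1}, seed=0)` returns a dictionary from grid nodes to `(name, edges)` pairs. Different seeds give different microstructures for the same inputs, which is the nondeterminism the text relies on.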


Clustering and convergence of material properties 


We constrain the underlying network to be a 
squared grid, without loss of generality, and 
use the building blocks in Fig. 2B and fig. S1. 
The nondeterministic assignment of building 
blocks leads to a diversity of architected ma- 
terials. Even given the same building blocks, 


adjacency rules, and frequency hints, the program 
generates different material microstructures 
every time. After generating the microstruc- 
tures, we evaluate their linear elastic proper- 
ties, Young’s modulus, and Poisson’s ratio in 
the x and y directions. To obtain these prop- 
erties, we perform numerical homogenization 
(38) using the statistical volume element (SVE) 
approach (39). The convergence of linear elas- 
tic properties is tested on three different sample 
sizes for the SVE and compared to the results 
of direct simulations on larger patches (40 by 
40 squared grid) of materials. As observed 
in Fig. 3A, when the SVEs are of grid size 20 
by 20, their properties are close enough to 
Fig. 4. Decoupled effect of topology and geometry on material properties.
(A) Ranges of properties covered by three different databases of samples, each 
obtained with different variants of building blocks. The dashed boundary of each 
cloud reaches to the extremal values of individual samples. (B to D) Zoom-in 

distribution of samples in each database. The pie plots are located at the mean 
value of a group of 100 samples, with fractions of the pie showing probabilities of
appearance of the corresponding building blocks. The insets show the 
geometries of the basic building blocks and their reference colors in the pie 
plots. (E) Typical designs from each of the three databases are shown, with the 
background colors matching the colors of the corresponding database. 
Fig. 5 annotations: (A) max(σ_v, hole)/max(σ_v, full) = 3.30. Region labels: stiff, ν > 0; soft, ν < 0 (a third label is illegible).


Fig. 5. Redirection of stresses and deformations. (A and B) Stress 
distribution in a piece of material compressed by a prescribed displacement, 
with and without the presence of a hole. (A) Piece of continuum material that 
has the same elastic properties as the homogenized properties of the irregular 
sample in (B). (B) Piece of irregular architected material. For all four cases in (A) 
and (B), the boundary condition and the color scale of von Mises stress (σ_v)
are shown on the left. Insets show a zoom-in view of stress near the hole. 


that of the large 40 by 40 samples. Therefore, 
for each particular set of inputs to the virtual 
growth program, we generate 100 material 
samples on a grid with 20 by 20 nodes and 
obtain the distribution of mechanical proper- 
ties by evaluating these 100 samples. 

We evaluate 11 groups of architected mate- 
rials generated by different frequency hints, 
but with the same basic building blocks and 
adjacency rules (Fig. 3B). The experimental 
samples are manufactured by three-dimensional 
(3D) printing that uses a stiff rubbery ma- 
terial [Semiflex, NinjaTek (38)]. In the exam- 
ples shown in Fig. 3B, the generated materials 
exhibit nearly tetragonal symmetry (not iso- 
tropic) with similar effective Young’s moduli 
and Poisson’s ratio when loaded along the x 
and y directions (38). Hence, we use their aver-
age values, i.e., E_avg (average effective Young's
modulus) and ν_avg (average Poisson’s ratio), to
compare performance of different architected
materials’ groups. To obtain a dimensionless
measurement, E_avg is normalized by the Young’s
modulus of the constituent material (E_s). From
the numerical samples (fig. S2), irregular ar-
chitected materials of the same group tend
to cluster together, in different patterns. The
marginal distributions of E_avg and ν_avg are
shown in Fig. 3C. The experimental samples also
follow similar trends in properties’ distribution, 
in agreement with numerical simulations. 
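
As a small worked example of the averaging and normalization just described, the following sketch computes E_avg/E_s and ν_avg for one sample and summarizes a group of samples. The function names and the argument layout are assumptions made for illustration; no values from the study are used.

```python
from statistics import mean, stdev

def averaged_properties(e_x, e_y, nu_x, nu_y, e_s):
    """Return (E_avg / E_s, nu_avg) for one sample.

    e_x, e_y: effective Young's moduli for loading along x and y
    nu_x, nu_y: effective Poisson's ratios for loading along x and y
    e_s: Young's modulus of the constituent material (same units as e_x)
    """
    e_avg = 0.5 * (e_x + e_y)
    nu_avg = 0.5 * (nu_x + nu_y)
    return e_avg / e_s, nu_avg

def group_summary(samples, e_s):
    """Mean and standard deviation of both averaged properties over a group.

    samples: iterable of (e_x, e_y, nu_x, nu_y) tuples, at least two entries.
    """
    pairs = [averaged_properties(*s, e_s) for s in samples]
    e_vals = [p[0] for p in pairs]
    nu_vals = [p[1] for p in pairs]
    return (mean(e_vals), stdev(e_vals)), (mean(nu_vals), stdev(nu_vals))
```

Averaging the two directions is only meaningful when the x and y responses are similar, which is the nearly tetragonal case the text describes.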

To study the structure-property relation de- 
termined by the presence of different building 
blocks, we focus on analyzing the mean values 
of the clusters (Fig. 3B). We observe that the 
probabilities of appearance of different build- 
ing blocks have a distinctive impact on the 
mechanical properties. For example, a higher 
probability of the “T”-shaped building block 
yields a decreasing Poisson’s ratio toward neg- 
ative values but has minimal influence on the 
material’s average Young’s modulus. A higher 
probability of “+”-shaped building block yields 
a larger Young’s modulus, but it has negligible 
effects on the Poisson’s ratio. In addition, a 
higher probability of both “T”- and “+”-shaped 
building blocks leads to materials with a 
relatively high Young’s modulus and relatively 
large negative Poisson’s ratio, displaying an 
additive influence of building block probabil- 
ities on mechanical properties. Such trends are 
robust and remain consistent in both numerical 
and experimental results. We note that the re- 
sultant probabilities of appearance of the build- 
ing blocks in the generated material samples 
are slightly different from the input frequency 


Fig. 5 annotation: (B) max(σ_v, hole)/max(σ_v, full) = 0.96; stress color scale in MPa.


(C and D) Face that “smiles” under lateral compression, owing to its 
heterogeneous microstructures. (C) 3D printed structure before compression. 
The false color shades refer to regions generated by different frequency hints 
that lead to different mechanical properties. The zoomed-in views show the 
smooth transition between different regions of the microstructure. (D) Structure 
during compression. The right half shows the stress distribution from numerical 
simulation. The arrows show the direction of loading. 


hints. This is due to the constraints imposed by 
the adjacency rules, as compatibility require- 
ments override the frequency hints (fig. S3). 
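
To make the distinction between input frequency hints and resultant probabilities concrete, the sketch below counts how often each building block appears in a generated sample and compares those fractions with the normalized hints. The list-of-names representation, the function names, and the toy numbers are assumptions for illustration only.

```python
from collections import Counter

def normalized_hints(hints):
    """Scale frequency hints so that they sum to one."""
    total = sum(hints.values())
    return {name: w / total for name, w in hints.items()}

def resultant_probabilities(assigned):
    """Fraction of each block among assigned nodes; None marks a defect."""
    counts = Counter(name for name in assigned if name is not None)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

# Toy input: hypothetical hints and a hypothetical 10-node sample.
hints = {"-": 1.0, "L": 1.0, "T": 2.0, "+": 4.0}
sample = ["+", "+", "T", "L", "+", "-", "T", "+", None, "L"]

target = normalized_hints(hints)
actual = resultant_probabilities(sample)
# Positive values mean the block appears more often than its hint suggests.
deviation = {name: actual.get(name, 0.0) - target[name] for name in hints}
```

A nonzero deviation is expected whenever the adjacency rules exclude the block that the hints would otherwise favor at a node.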

We observe some hysteresis effects from 
the experimental stress-strain curves (Fig. 3, 
D to G). This is likely due to the constituent 
material’s viscoelasticity and large deformation- 
induced contacts and frictions between nearby 
elements (fig. S4). Nevertheless, we only focus 
on the linear regime of the experimental load- 
ing curves and extract the value of the Young’s 
modulus in a particular direction, as the secant 
modulus between 0.005 and 0.015 strain. We 
use a digital image correlation system to track 
the deformations and obtain the values of 
Poisson’s ratio (38). The discrepancies between 
the numerical and experimental results (Fig. 
3B) are possibly caused by imperfect boundary 
conditions (e.g., friction), manufacturing error, 
and local nonlinear effects. In particular, the 
group of samples with a high probability of the 
“—” building block (Fig. 3G) experiences strong 
nonlinear effects, as the long beams buckle 
immediately after being loaded. In fact, our 
experiments show that not only the linear elas- 
tic properties but also the nonlinear responses 
of the samples from the same group tend to 
behave similarly (fig. S3). 
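
The secant-modulus definition used for the experimental curves can be written out as follows. This is a sketch that assumes the loading branch is available as arrays of increasing effective strain and the corresponding effective stress; the function name and the use of linear interpolation between measured points are assumptions, not details given in the text.

```python
import numpy as np

def secant_modulus(strain, stress, lo=0.005, hi=0.015):
    """Secant modulus between two strain levels of a loading curve.

    strain: 1D array of effective strain, strictly increasing
    stress: 1D array of effective stress at those strains
    The stress at lo and hi is obtained by linear interpolation.
    """
    strain = np.asarray(strain, dtype=float)
    stress = np.asarray(stress, dtype=float)
    if strain.min() > lo or strain.max() < hi:
        raise ValueError("curve does not span the requested strain window")
    s_lo, s_hi = np.interp([lo, hi], strain, stress)
    return (s_hi - s_lo) / (hi - lo)
```

With stress in megapascals the result is in megapascals. Because only two points of the curve enter the calculation, hysteresis on unloading does not affect the value.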
Fig. 6. Extension to 3D irregular microstructures. (A) Basic building blocks.
(B) Three geometric variants of selected building blocks. (C) Ranges of 
properties covered by the generated architected materials. Each cloud 
corresponds to the database that was generated by using different variants of 
building blocks in (B). (D) Zoom-in distribution of samples in the first database. 
The pie plots are located at the mean value of a group of 100 samples, with 
fractions of the pie showing probabilities of appearance of the building blocks 


Construction of material databases 

The virtual growth program efficiently gen- 
erates materials that cover a wide range of 
linear elastic properties (Fig. 3). Hence, it can 
be used as a tool to explore the design and 
property space of architected materials by vary- 
ing inputs. We demonstrate how changing both 
the topology and geometry of material micro- 
structures (Fig. 4) results in three databases 
that contain 54,000 samples of architected 
materials. 
The three clouds in different colors refer to
the material samples that were generated by 
using three geometric variants of the building 
blocks. Each cloud consists of 180 groups of 
samples generated by 180 different combina- 
tions of frequency hints (38). The angles of 
the “T”-shaped and “L”-shaped building blocks 
are changed from an acute angle to a right 
angle and to an obtuse angle (Fig. 4, B to D). 
The red shaded cloud is occupied by the ma- 
terial samples that were generated by using 




within the bounding box of the corresponding color in (A). (E and F) Influence of 
the probability of appearance of certain basic building blocks (insets) on different 
mechanical properties. (G and H) Directional Young's modulus (E, normalized by 
Es, the Young's modulus of the constituent material) and shear modulus 

(G, normalized by Gs, the shear modulus of the constituent material) of the group 
of samples marked by a diamond box in (D). (I) Digital rendering of a material sample
in the marked group in (D). 


the first set of variants. Because these mate- 
rials are rich in the “T”-shaped building blocks 
with a re-entrant acute angle, they mostly ap- 
pear to be auxetic. As we change the geometries 
of the building blocks (Fig. 4, C and D), the 
range of the average Young’s modulus remains 
almost the same, but the Poisson’s ratio of the 
entire cloud shifts toward the positive range 
(Fig. 4A). An obvious negative correlation is 
observed between the average Poisson’s ratio 
and the probability of the appearance of the 
“T”-shaped building block (fig. S5). In gen-
eral, the growth rules and mechanical proper-
ties present nontrivial yet clear correlations 
(fig. S5). Typical materials from each of the 
three clouds of samples are shown in Fig. 4E. 
Despite the different geometries, these three 
samples share the same topology because they 
have the same underlying network, only filled 
with different building blocks, similar to the 
examples in Fig. 2. 

With the virtual growth program, we can 
obtain a wide range of irregular, yet program- 
mable, architected materials. The programmable 
properties result from the nontrivial probability 
distribution of the stochastic topologies and geo- 
metries. The property space can be further 
expanded. For example, we can introduce di- 
rectional preferences of the building blocks, 
which drives the current nearly tetragonal elas- 
ticity to orthotropic. Moreover, by adding new 
building blocks, we can substantially improve 
the overall shear modulus of the generated ma- 
terials [see (38) and fig. S6 for elaboration]. 

One advantage of irregular mate-
rials is that they offer redundant load paths:
When one part of the material is damaged, 
the stress within the irregular architecture 
is redistributed through the complex micro- 
structural network. This redistribution ensures 
that the maximum stress anywhere within 
the material remains almost the same, before and
after damage, which prevents a cascading 
failure. We compare the stress distribution in 
a continuum and in an irregular architected 
material, before and after punching a hole in 
the sample (Fig. 5, A and B). Results of com- 
pression tests show that, although the uniform 
sample shows classical stress concentration 
near the hole, the irregular material shows no 
such stress concentration. Rather, the stress in 
the sample with a hole is redistributed through- 
out the entire sample without drastic varia- 
tions in peak stress, compared with peak stress 
values of the sample without a hole. 
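
The comparison of peak stresses with and without a hole reduces to a ratio of maxima over the von Mises stress field. The sketch below assumes a 2D plane-stress field sampled at arbitrary points (for example, element centroids from a simulation); the function names and the plane-stress assumption are illustrative and are not taken from the text.

```python
import numpy as np

def von_mises_plane_stress(sxx, syy, sxy):
    """Von Mises stress for a plane-stress state, evaluated elementwise."""
    sxx, syy, sxy = (np.asarray(a, dtype=float) for a in (sxx, syy, sxy))
    return np.sqrt(sxx**2 - sxx * syy + syy**2 + 3.0 * sxy**2)

def peak_stress_ratio(vm_hole, vm_full):
    """Peak von Mises stress with a hole divided by the peak without one."""
    return float(np.max(vm_hole) / np.max(vm_full))
```

A ratio well above 1 indicates a stress concentration introduced by the hole, whereas a ratio near 1 means the peak stress is essentially unchanged by the damage.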

Irregular microstructures can be designed 
to present heterogeneous distributions of local 
elastic properties (40). For nonperiodic archi- 
tected materials that are designed from a 
database of unit cells (24), tessellating differ- 
ent structures and constituent materials while 
ensuring connectedness and compatibility is 
challenging (25, 40, 42). By using the virtual 
growth program, designing materials with 
inhomogeneous properties is possible with a 
single, continuous process by assigning dif- 
ferent frequency hints to different regions of 
the sample. With this approach, connectedness 
and compatibility are automatically guaran- 
teed by the adjacency rules. For instance, 
we show how it is possible to design an in- 
homogeneous microstructure that can con- 
centrate deformations in selected areas of a 
sample. We highlight this ability by designing 
a “face” that “smiles” when being compressed 
from the sides (Fig. 5, C and D). To change the
deformation characteristics, we assigned dif- 
ferent frequency hints to the different regions 
on the “face” (Fig. 5C). These sets of frequency 
hints are extracted from our databases (Fig. 4 
and fig. S6). 

By defining 3D building blocks (Fig. 6A) and 
adjacency rules, the virtual growth program 
can be extended to produce 3D irregular ar- 
chitected materials. Similar to the 2D case, we 
constructed a database of 33,000 material sam- 
ples that were based on three different geo- 
metric variations on selected building blocks 
(Fig. 6B) and 110 different frequency hints (Fig. 
6, C and D). Each material sample is generated 
on a 10 by 10 by 10 cubic grid. Each building 
block is enclosed in a cube of size 5 mm by 
5 mm by 5 mm, and the lattice (beam) mem- 
bers are assumed to be circular, with a radius of 
1 mm. We observe interesting correlations be-
tween the probabilities of appearance of build-
ing blocks and the mechanical properties (Fig.
6, E and F, and figs. S7 to S9). The anisotropy 
of the generated materials can be seen from 
the directional Young’s modulus and shear 
modulus (Fig. 6, G and H) as a result of our 
particular selection of basic building blocks. A 
rendered image of a typical sample highlights 
the 3D irregular architecture (Fig. 6I).
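
The database sizes quoted for the 2D and 3D cases follow from multiplying the number of geometric variants, the number of frequency-hint combinations, and the 100 samples generated per group. The short check below spells out that arithmetic; the function name is hypothetical, and the sample edge length assumes that the building blocks tile the grid without overlap.

```python
def database_size(n_variants, n_hint_sets, samples_per_group):
    """Total number of samples in a database."""
    return n_variants * n_hint_sets * samples_per_group

size_2d = database_size(3, 180, 100)   # 54,000 samples in the 2D databases
size_3d = database_size(3, 110, 100)   # 33,000 samples in the 3D database

# Edge length of one 3D sample: 10 blocks per side, 5 mm per block.
edge_mm = 10 * 5.0                      # 50.0 mm
```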


Discussion and outlook 


We describe fundamental, probabilistic rules 
that control the overall mechanical response 
of irregular materials. Our approach establishes 
a general, graph-based representation of mate- 
rial microstructures, which we use to create 
architected materials with functionally graded 
properties and to demonstrate robustness 
against damage. In the future, the approach 
could be further extended to design materials 
with prespecified properties by incorporating 
optimization approaches in the selection of 
building blocks and/or in the adjacency rules 
for growth. The basic building blocks could 
also be selected to have more geometries (e.g., 
learned from data), different constitutive ma- 
terials, and dimensional scales (e.g., to realize 
hierarchical materials). The underlying graph, 
which in this work is represented as squared 
or cubic grids, can be extended to have more 
complex connectivity. Because the virtual growth 
program is independent from any particular 
material properties, it is readily applicable to 
discover nonlinear and multiphysical proper- 
ties of materials. 
We present the first ancient DNA data from the Pre-Pottery Neolithic of Mesopotamia (Southeastern Turkey and Northern Iraq), Cyprus, and the Northwestern
Zagros, along with the first data from Neolithic Armenia. We show that these and neighboring populations were formed through admixture of pre-Neolithic 
sources related to Anatolian, Caucasus, and Levantine hunter-gatherers, forming a Neolithic continuum of ancestry mirroring the geography of West Asia. 
By analyzing Pre-Pottery and Pottery Neolithic populations of Anatolia, we show that the former were derived from admixture between Mesopotamian-related 
and local Epipaleolithic-related sources, but the latter experienced additional Levantine-related gene flow, thus documenting at least two pulses of migration 
from the Fertile Crescent heartland to the early farmers of Anatolia. 


Previous work has documented the existence of
highly differentiated Neolithic populations in
ancient West Asia (7-9) and some of their
pre-Neolithic antecessors in the Caucasus (10),
Iran (1, 11), Anatolia (6), and the Levant (7). To
anchor our integrative genomic history of the
Southern Arc, a region we define as including Anatolia
and its neighbors in Southeastern Europe and 
West Asia (12), we sought to understand how 
the earliest Neolithic populations were formed, 
with a particular focus on the Pre-Pottery pe- 
riod of Northern (or Upper) Mesopotamia, the 
area between the Tigris and Euphrates rivers 
of Southeastern Turkey, Northwestern Iraq, 
and Northeastern Syria, within the Pre-Pottery 
Neolithic interaction sphere (73). Despite the 
centrality of Mesopotamia in the archaeolog-
ical record of the origin of farming (14), no
genome-wide ancient DNA data from early 
Mesopotamian farmers has been published. 
We used in-solution enrichment for ~1.2 million 
single nucleotide polymorphisms (SNPs) to 
study Pre-Pottery Neolithic farmers from the 
Tigris side of Northern Mesopotamia: one from 
Boncuklu Tarla near Mardin in Southeastern 
Turkey and two from Nemrik 9 in Northern 
Iraq. We also report the first Pre-Pottery Neo- 
lithic data from Cyprus, an island to the south 
of the Anatolian peninsula and west of the 
Levant, which witnessed the earliest mari- 
time expansion of Pre-Pottery farmers from 
the Eastern Mediterranean; our data come 
from three individuals whose fragmentary 
remains were found in a Neolithic disused and 
filled-in water well at Kissonerga-Mylouthkia 


(5). Furthermore, we report the first ancient 
DNA data from the Neolithic of Armenia, from 
two individuals buried at the sites of Masis 
Blur and Aknashen in the sixth millennium 
BCE. These individuals represent an inland 
Pottery Neolithic population, which we could 
compare to the Pre-Pottery one from Northern 
Mesopotamia to its south, the Pottery Neolithic 
one of Azerbaijan to its east (7), and later 
Chalcolithic individuals from Armenia (1).
Finally, we sampled three Pre-Pottery Neolithic 
farmers from the Northern Zagros at Bestansur 
and the Zawi Chemi component of Shanidar 
cave in Iraq, who fill a gap between the more 
western and northern individuals and published 
data from the Central Zagros in Iran (1). 
Details of the newly sampled individuals can 
be found in (72), and their geographical and 
temporal distributions can be seen in Fig. 1. To
improve the statistical power of our analyses, 
we also increased data quality for a number 
of individuals with previously reported data, 
making and sequencing additional ancient DNA 
libraries from four Epipaleolithic Natufians 
from Israel, six Pre-Pottery Neolithic indi-
viduals from Jordan (1), and nine Neolithic
individuals from the Eastern Marmara re-
gion (Northwest Anatolia, sites of Barcin and
Mentese) (2). From Eastern Marmara, we also
sampled an individual from Barcin and two
from the previously unsampled site of Ilipinar. 
Individuals from the three sites were genetically 
similar, and we analyze them, together with 
later Chalcolithic individuals from the same 
site, in a study of later periods of Anatolia (72). 

We carried out principal components analysis 
(PCA) (16) (Fig. 2A), projecting the ancient indi- 
viduals onto the variation of present-day West 
Eurasians (17). Two main clusters emerge: an 


“Eastern Mediterranean” Anatolian/Levantine 


cluster that also includes the geographically 
intermediate individuals from Cyprus, and 
an “inland” Zagros-Caucasus-Mesopotamia- 
Armenia-Azerbaijan cluster. There is structure 
within these groupings. Anatolian individu- 
als group with each other and with those from 
Cyprus, whereas Levantine individuals are 
distinct. Within the inland cluster, individuals 
that are more geographically distant from 
the Mediterranean, such as those from the 
South Caucasus [Caucasus hunter-gatherers 
from Georgia (10) and Ganj Dareh from Cen-
tral Zagros], are also genetically more distant
as compared with the geographically and
genetically intermediate individuals from
Mesopotamia and Armenia/Azerbaijan. The
Eastern Mediterranean and inland clusters are
separated by a gap in Fig. 2A, which may cor-
respond to geographically intermediate areas
between sampling locations, for example, the
Euphrates region of North Mesopotamia. The
totality of Neolithic West Asia is enclosed within
the range of variation of the quadrangle formed
by Caucasus hunter-gatherers, Ganj Dareh,
Levantine Natufians (1) from Israel, and Epi-
paleolithic Pınarbaşı (6) from Central Anatolia.

Fig. 1. Studied individuals. (A) Time frame of Pre-Neolithic, Pre-Pottery Neolithic, and Pottery Neolithic
populations in West Asia. (B) Geographical location of populations from (A) shown on the map of West Asia.

In a linked study, we developed a mathe-
matical framework for estimating the ancestry
proportions of individuals of the entire South-
ern Arc across space and time with a common
metric (12), and here we discuss the results
of applying this model to the Neolithic period
(Fig. 2B). This model includes Caucasus hunter-
gatherers (10), Eastern European hunter-
gatherers (2, 18), Levantine Pre-Pottery Neolithic
(1), Balkan hunter-gatherers from the Iron
Gates in Serbia (79), and Anatolian Neolithic
[from Barcin in the Marmara region of North-
west (NW) Anatolia (2)] as surrogates for
five ancestry sources. Within this framework,
the highest proportion of Anatolian Neolithic-
related ancestry is observed in Neolithic
Anatolian populations as well as the early farm-
ers of Cyprus. The Balkan hunter-gatherer-
related affinity in the Pre-Pottery population
at Boncuklu and the Epipaleolithic one from
Pınarbaşı, both of which predate the Pottery
Neolithic from Barcin by thousands of years,
does not indicate that these older individuals
were admixed with European hunter-gatherers.
Rather, it reflects the fact that in comparison
to the Barcin population, both Pınarbaşı and
Boncuklu were “less Levantine” (Fig. 2A), a
finding that is consistent with the Levantine
influx into the Pottery Neolithic populations that
is revealed by the analysis that follows. A con-
trasting case is that of the Natufians, who are
inferred to be “more Levantine” (along the 
Anatolian/Levantine cline) and are unsurprisingly 
inferred to derive all of their ancestry from the 
Levant Pre-Pottery Neolithic source; this of 
course does not mean that the earlier Natufians 
are descended from the Pre-Pottery Neolithic 
farmers that followed them but rather that both 
share ancestry (in reality, from the Natufians 
to the Pre-Pottery Neolithic farmers), which 
is modeled in this way within the limitations 
of the five-way model. Similarly, the Ganj Dareh 
population (most extreme) of the inland group 
derives all its ancestry from the Caucasus hunter- 
gatherer source used in the five-way model, and 
Caucasus hunter-gatherer-related ancestry lev- 
els are high in all inland populations, that is, of 
the Northern Zagros, Armenia, and Azerbaijan, 
as well as those of North Mesopotamia. 

The high Anatolian-related ancestry in 
Cyprus revealed by this model (Fig. 2) and 
subsequent analyses (Fig. 3) sheds light on 
debates about the origins of the people who 
spread Pre-Pottery Neolithic culture to Cyprus. 
Parallels in subsistence, technology, settlement 
organization, and ideological indicators (15) 
suggest close contacts between Pre-Pottery 
Neolithic B people in Cyprus and on the main- 
land (13), but the geographic source of the 
Cypriot Pre-Pottery Neolithic populations has 
been unclear, with many possible points of 
origin (20). An inland Middle Euphrates source 
has been suggested on the basis of architec- 
tural and artifactual similarities (74, 27). How- 
ever, the faunal record at Cypriot Pre-Pottery 
Neolithic B sites and the use of Anatolian 
obsidian as raw material suggest linkages 
with Central and Southern Anatolia (75), and 
the genetic data increase the weight of evidence 
in favor of this scenario of a primary source 
in Anatolia. 

The two individuals from Armenia, from the 
sites of Aknashen (~5900 BCE) and Masis Blur 
(~5600 BCE) differ in being more Caucasus- 
and Anatolia/Levant-like, respectively, despite 


being buried just ~200 km and a few centuries 
apart; thus, Neolithic people of Armenia were 
not homogeneous but instead exhibited var- 
iation that also encompassed two ~5700 to 
5400 BCE individuals buried in neighboring 
Azerbaijan (7), who are intermediate between 
the two from Armenia in both PCA and the 
five-way model. But in comparison to the in- 
dividuals from Mesopotamia to the south, the 
individuals from Armenia and Azerbaijan had 
more Anatolian Neolithic admixture (visible 
in both PCA and the five-way model). Con- 
versely, some Neolithic Anatolian populations 
from Central Anatolia had Caucasus hunter-
gatherer-related admixture, more than Pınarbaşı
and the NW Anatolian source population, where 
such ancestry is not evident, but less than the 
proportion inferred for the individual from 
Mardin from Southeast Anatolia, which be- 
longed (together with its neighbors at Nemrik 9 
in Northern Iraq) to the inland group character- 
ized by high Caucasus hunter-gatherer-related 
ancestry. These observations form a consistent 
picture of a Neolithic continuum characterized 
by the Anatolian/Levantine cline on one end 
and inland influence related to the Zagros- 
Caucasus set of populations, with the geo- 
graphically intermediate individuals from 
Mesopotamia, Armenia, and Azerbaijan occu- 
pying genetically intermediate positions. 

To avoid publication-order bias, that is, the 
tendency to update published models to ac- 
commodate new data rather than always in- 
ferring models taking all samples equally into 
account, we coanalyzed new data from the 
Neolithic together with previously published 
data to arrive at a model of Neolithic origins 
that can account for patterns of genetic var- 
iation in Neolithic West Asia as a whole (22). 
The Neolithic continuum emerges from this 
analysis too, as all Neolithic populations under 
study can be modeled as mixtures of three 
pre-Neolithic sources representing Anatolian 
(Pınarbaşı), Levantine (Natufian), and inland
sources (either Caucasus hunter-gatherer, as 
in Fig. 3A, or Ganj Dareh, as in Fig. 3B); the 
two inland sources are not independent but to 
a first degree of approximation represent the 
same source of ancestry (Fig. 3C). When we 
attempt to model Neolithic populations using 
either Caucasus hunter-gatherers or Ganj Dareh 
as a source population and the other as an 
outgroup, we obtain good model fits for most 
populations (further suggesting that neither 
population is a better source than the other), 
except (i) for the high Caucasus hunter-gatherer 
ancestry individual from Aknashen, where the 
Caucasus hunter-gatherer model is not rejected 
(P = 0.46) while the Ganj Dareh one is (P < 
0.001); (ii) the Azerbaijan and Mesopotamian 
Neolithic for which both models are rejected 
(P < 0.01); and (iii) the Barcin Neolithic for 
which the Ganj Dareh model is narrowly not 
rejected at the P = 0.01 level (P = 0.0142), while 
Fig. 3. The Neolithic continuum. (A) Three-way model of Neolithic admixture with
Caucasus hunter-gatherer (CHG) (10) as a source. (B) Three-way model of Neolithic
admixture with Ganj Dareh (1) as a source. (C) Caucasus hunter-gatherer and Ganj
Dareh admixture proportions from (A) and (B) are strongly correlated [coefficient of
determination (R^2) = 0.91; P < 1 × 10^-7]. (D) We also modeled Neolithic populations with
local, Anatolian [Pınarbaşı (6)] and Eastern, Mesopotamian Pre-Pottery Neolithic
(PPN), proximal sources. Both Pre-Pottery Neolithic populations from Anatolia [from
Boncuklu (6) and Aşıklı Höyük (8)] have no significant evidence for extra Levantine
ancestry. However, all three Pottery Neolithic ones [from Barcin in NW Anatolia and
Tepecik-Çiftlik (5) and Çatalhöyük (8) in Central Anatolia] have significant additional
Levantine ancestry. (Ancestry proportions for some groups are nonsignificantly
negative, reflecting statistical uncertainty in the estimates.)
Fig. 4. The dilution of Neolithic ancestry in the Levant. The trajectory of West Asian components of ancestry in the Levant. (A) Caucasus hunter-gatherer
ancestry increased over time, beginning in the Chalcolithic period and continuing into the Bronze Age, while the local Levantine ancestry (B) was diluted during the
past 10,000 years. (C) Anatolian ancestry, like Caucasus hunter-gatherer ancestry, also increased by the Chalcolithic period (26), undergoing fluctuations thereafter.


the Caucasus hunter-gatherer one is rejected 
(P = 0.001). These results tentatively suggest 
that Caucasus hunter-gatherer and Ganj Dareh 
Neolithic are interchangeable for the purposes 
of quantifying the amount of inland admixture, 
although some populations may have a clearer 
connection with one or the other (e.g., the Neo- 
lithic of Armenia with the hunter-gatherers of 
the South Caucasus rather than Iran, and the 
geographically intermediate Azerbaijan and 
Mesopotamia with both). 
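
The accept/reject wording used for these model fits amounts to a threshold rule on the model P-value. A minimal Python sketch of that rule follows, using only P-values quoted in the text and assuming the 0.01 level mentioned for Barcin is the cutoff. The helper name `is_rejected` and the dictionary layout are illustrative and are not part of the study's analysis pipeline.

```python
# Illustrative only: classify admixture-model fits by their reported P-values.
# ASSUMPTION: a fit is "rejected" when P falls below the 0.01 level that the
# text uses for the Barcin Neolithic comparison.
REJECTION_LEVEL = 0.01


def is_rejected(p_value, level=REJECTION_LEVEL):
    """Return True when the model fit is rejected at the given level."""
    return p_value < level


# (population, source used in the model) -> P-value as quoted in the text
reported_fits = {
    ("Aknashen", "Caucasus hunter-gatherer"): 0.46,
    ("Barcin Neolithic", "Ganj Dareh"): 0.0142,
    ("Barcin Neolithic", "Caucasus hunter-gatherer"): 0.001,
}

for (population, source), p in reported_fits.items():
    verdict = "rejected" if is_rejected(p) else "not rejected"
    print(f"{population}, {source} source: P = {p} -> {verdict}")
```

The Barcin case shows why the text calls the Ganj Dareh fit "narrowly" not rejected: 0.0142 sits just above the cutoff, whereas 0.001 is an order of magnitude below it.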

The fact that regardless of the chosen sources, 
none of the Neolithic populations of West Asia 
were simple descendants of their pre-Neolithic 
antecedents when we had the data to test this 
(in which case some of them would occupy the 
corner positions of Fig. 3, A and B) suggests that 
some history of admixture may have led to their 
appearance; the details of this process could 
be elucidated by examining even older pop- 
ulations from across West Asia. When pre- 
Neolithic antecedents are not available, as is 
the case for North Mesopotamia, it remains 
an open question whether the local hunter- 
gatherers were genetically continuous with 
the first farmers of the region, or if there was a 
history of admixture across the Neolithic tran- 
sition there as well. Notably, this highlights 
that intermediate populations of the ternary 
plots of Fig. 3 need not have come about by 
admixture from the corner populations used 
to model them; alternatively, they could be 
drawn toward the middle by unsampled pre- 
Neolithic populations of West Asia, for ex- 
ample, hunter-gatherers of the Tigris and 
Euphrates regions predating the Pre-Pottery 
Neolithic farmers studied here. 

When we attempted to model Neolithic pop- 
ulations as mixtures of each other, we observed 
that at least in Anatolia (Fig. 3D), where most 
of the data are from and from which both Pre- 
Pottery and Pottery Neolithic populations have 
been published, an interesting distinction be-
came clear. Pre-Pottery Neolithic populations
from Central Anatolia can be modeled as mix-
tures of a group related to the local Pınarbaşı
Epipaleolithic with variable (~30 to 70%) 
Mesopotamian admixture, suggesting that 
Pre-Pottery cultures of Anatolia may have been 
formed with the contribution of both local 
hunter-gatherers and migrants from the east, 
where agriculture first appeared. But we can- 
not model the Pottery Neolithic Anatolians 
with just these two sources and instead re- 
quire an extra ~6 to 23% Levantine Neolithic 
admixture. The source of this admixture is 
unclear; it need not have come from the 
Southern Levant (Jordan) from which the 
Levantine Neolithic individuals were sampled 
and may instead represent a geographically 
closer source for which there is no available 
genome-wide data, for example, from Syria, 
where early Pottery Neolithic cultures such 
as the Halafian flourished and for which the 
available polymerase chain reaction-based mito- 
chondrial DNA data cannot distinguish alter- 
native scenarios (23). 

We caution that while our results point to 
migration from, and admixture with, Mesopo- 
tamian and Levantine populations, when we 
use the term “migration,” we are not claiming 
that we have detected a “migratory movement,” 
that is, a planned translocation of a large num- 
ber of people over a long distance within the 
space of years [for discussion of nuances in 
the use of the term migration, see (24)]. Mi- 
gration in the sense that we use it may either 
be intentional or not; it may involve few or 
many individuals; and it may either be rapid 
or continue across many generations. Some 
such migration and admixture must have 
taken place, as indicated by the genetic data, 
but its causes, routes, and fine-grained tem- 
porality remain to be clarified. 

A further caveat is that the Levantine influ- 
ence detected in Anatolian Pottery Neolithic
populations need not have been the result of
unidirectional migration into Anatolia but 
may also have come about if Anatolia and 
the Levant became part of a mating network 
spanning both regions. Data from Pottery 
Neolithic cultures of the Levant are needed to 
test this hypothesis and to determine whether 
there was movement of mating partners in 
both directions. 

Levantine ancestry may have flourished 
during the Neolithic, and yet its later trajec- 
tory in the Levant itself (including individuals 
from Jordan, Israel, Syria, and Lebanon) ex- 
hibits a decrease of ~8% per millennium from 
the Pre-Pottery Neolithic down to the Medieval 
period, largely replaced by Caucasus- and 
Anatolian-related ancestry from the north and 
west (Fig. 4). This persistent and sustained 
trend after the formation of the Neolithic West 
Asian populations studied here reminds us 
that large-scale admixture continued in ensu- 
ing millennia. Despite the major decline in the 
contribution of Levantine Neolithic farmers to 
peoples in the region where they originated, 
this key ancestry source made a vital contri- 
bution to peoples of later periods, continuing 
until the present and weaving, through migra- 
tions and mixtures within and beyond the 
Southern Arc (12, 25), the tapestry of ancestry 
of all those that followed them.

NANOPHYSICS


Tunable light-induced dipole-dipole interaction 
between optically levitated nanoparticles 


Jakob Rieser, Mario A. Ciampini, Henning Rudolph, Nikolai Kiesel, Klaus Hornberger,
Benjamin A. Stickler*, Markus Aspelmeyer*, Uroš Delić*


Arrays of optically trapped nanoparticles have emerged as a platform for the study of complex 
nonequilibrium phenomena. Analogous to atomic many-body systems, one of the crucial ingredients is 
the ability to precisely control the interactions between particles. However, the optical interactions 
studied thus far only provide conservative optical binding forces of limited tunability. In this work, we 
exploit the phase coherence between the optical fields that drive the light-induced dipole-dipole 
interaction to couple two nanoparticles. In addition, we effectively switch off the optical interaction and 
observe electrostatic coupling between charged particles. Our results provide a route to developing 
fully programmable many-body systems of interacting nanoparticles with tunable nonreciprocal 
interactions, which are instrumental for exploring entanglement and topological phases in arrays of
levitated nanoparticles.


When a dielectric subwavelength particle is
illuminated by laser light, the particle is
polarized in phase with the incoming
electromagnetic wave. The induced dipole makes
the particle a high-field seeker, which enables optical
trapping in the intensity maximum of focused 
lasers (1). Additionally, the dipole radiation 
field acquires the optical phase of the trapping 
field. This process, called coherent scattering, 
has been used in combination with an optical 
cavity to cool the motion of atoms and polar- 
izable nanoparticles in far-detuned traps (2-5). 
More recently, it has been applied to achieve 
motional ground-state cooling of single silica 
nanoparticles in an optical cavity (6) and with 
real-time feedback (7, 8). 

Simultaneous trapping of more than one 
particle in a single optical potential allows for 
the creation of a self-organized structure of 
particles that interact through the scattered 
light (9). The particles assume steady-state 
positions at locations where the constructive 
interference of scattered fields is maximized, 
thereby minimizing the total energy. This 
optical interaction is fundamentally conser-
vative and reciprocal, giving rise to a spring-
type interaction called optical binding (9, 10).
Optical binding between dielectric objects
has been realized for microparticles (where
the radius is comparable to or larger than the
wavelength) in many experiments (10-18). In
this regime, the effect is well described by 
Mie scattering theory (9), in which the light 
scattered from one of these particles is not 
spatially coherent over the extension of a 
neighboring particle. For the case of nano- 
scale objects, optical binding has been dem- 
onstrated with metal particles in liquid, where 
plasmon resonances enhance the interaction 
(19-21). However, making full use of the op- 
portunities provided by optically trapped 
nanoparticle arrays for investigating complex 
nonequilibrium phenomena requires controlla- 
ble interactions beyond the current framework 
of optical binding (22-24). The tools presented 
in this article pave the way to using the tech- 
nology of atomic physics (25, 26) for the gen- 
eration and observation of quantum correlations 
and topological phases in a fully programmable 
mechanical array (27, 28). 

Fig. 1. Experimental setup. (A) Two laser beams are diffracted by the SLM and focused with the microscope
objective to create two distinct optical tweezers. The optical traps are in the vacuum chamber at a pressure
of ~1 mbar. The light is collected after the focus and used for detection of mechanical modes. We drive the
electrodes with a voltage V to calibrate the number of charges. (B) Camera image of two nanoparticles
(radius r = 105 nm) trapped in two optical traps at a distance $d_0 \approx 10$ µm. (C) Side view (above) and top view
(below) of the trap foci. Two parallel laser beams are used to trap two nanoparticles at a distance $d_0$. Intrinsic
mechanical frequencies along the z axis $\Omega_1 \propto \sqrt{P(1+\eta)}$ and $\Omega_2 \propto \sqrt{P(1-\eta)}$ are controlled by a single
parameter $\eta$. Polarization is set along the y axis to maximize dipole radiation along the x axis. We set the
optical phases $\phi_1$ and $\phi_2$ with the SLM. (D) Our system can be simplified to two harmonic oscillators with
frequencies $\Omega_{1,2}$ that are coupled through dipole-dipole interaction with nonreciprocal coupling rates $g \pm \bar{g}$.
(E) In case of reciprocal coupling ($\bar{g} = 0$), the normal modes of the system, the COM mode $z_+$ and the
breathing mode $z_-$, are nondegenerate with frequencies $\Omega_+ = \Omega$ and $\Omega_- = \Omega + 2g$, respectively. We observe
the normal modes at frequencies $\Omega_+/2\pi = 51$ kHz and $\Omega_-/2\pi = 63$ kHz in the power spectral density (PSD)
of the detector signal, which we fit to determine the mechanical frequencies. a.u., arbitrary units.

In contrast with previous experiments, our
work shows fully tunable and nonreciprocal
optical interactions between two silica
nanoparticles, with radius ($r = 105 \pm 3$ nm)
appreciably smaller than the wavelength
($\lambda = 1064$ nm), that are levitated in two distinct,
phase-coherent optical traps at a variable trap
separation $d_0$. Each particle behaves as an
induced dipole driven by the total optical
field, which is a sum of the trapping field
and the coherently scattered light from the
other particle. The interference between these
two fields gives rise to the interaction between
the particles and affects their motion
in all three dimensions. The total light-induced
interaction is a combination of a conservative
gradient force and a nonconservative radiation
pressure force, in analogy to the forces acting
on a single nanoparticle in an optical trap
(1). The relevant contributions to the optical
interparticle forces oscillate periodically and
decay with the interparticle distance d as (29)
$\mathbf{F}_{1,2} \propto \sin(kd \mp \Delta\phi)\,[\pm k\,\mathbf{n} + (k - 1/z_R)\,\mathbf{e}_z]/kd$
in the far field ($kd \gg 1$, $k = 2\pi/\lambda$). Here, $\Delta\phi$
denotes the optical phase difference between
the trapping lasers at the particle position, $\mathbf{n}$ is
the unit vector pointing from particle 1 to
particle 2, and $z_R$ is the Rayleigh length. The
interaction between two equally sized particles
is thus fundamentally nonreciprocal
($\mathbf{F}_1 \neq -\mathbf{F}_2$), which has not been explored in
previous theoretical and experimental optical
binding studies (9, 14, 16).
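
The nonreciprocity can be checked numerically from the far-field expression. The sketch below keeps only the force component along n and drops the overall prefactor, so the numbers are unitless and illustrative. The function names and the sample separation of 2.2 µm are assumptions made for this example; the wavelength is the 1064 nm quoted in the text.

```python
import math

WAVELENGTH = 1.064e-6            # m, trapping wavelength quoted in the text
K = 2 * math.pi / WAVELENGTH     # wave number k = 2*pi/lambda


def force_along_n(d, dphi):
    """Unitless force components along n for particles 1 and 2.

    Uses the n-component of F_{1,2} ∝ sin(kd ∓ Δφ)(±k n)/kd, divided by k.
    The common prefactor is omitted (ASSUMPTION: it is equal for both particles).
    """
    f1 = +math.sin(K * d - dphi) / (K * d)
    f2 = -math.sin(K * d + dphi) / (K * d)
    return f1, f2


def is_reciprocal(d, dphi, tol=1e-12):
    """Reciprocal means F1 = -F2, i.e. the two components sum to zero."""
    f1, f2 = force_along_n(d, dphi)
    return abs(f1 + f2) < tol


d_example = 2.2e-6  # m, hypothetical separation inside the range studied
for dphi in (0.0, math.pi / 2, math.pi):
    f1, f2 = force_along_n(d_example, dphi)
    print(f"dphi = {dphi:.3f} rad: F1 = {f1:+.4f}, F2 = {f2:+.4f}, "
          f"reciprocal = {is_reciprocal(d_example, dphi)}")
```

Under these assumptions the sum F1 + F2 is proportional to cos(kd) sin(Δφ), so the forces along n are equal and opposite only when Δφ is a multiple of π (or when cos(kd) vanishes).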

We obtain the linear coupling between the
particles from expanding the optical forces in
terms of the relative motion along the trap
axes (x, y, and z). The particle motion along
the x direction modifies the distance to
$d = d_0 + x_1 - x_2$, where the primary contribution
to the coupling follows from the spatial
dependence $kd$ of the interference between the
local trapping and scattered fields. The coupling
rate thus depends on the distance as
$\propto (kd_0)^{-1}$, which has been confirmed in several
studies of the lateral optical binding with
microparticles (10, 13, 16, 21, 29). On the other
hand and in contrast with previous optical
binding studies, the dominant contribution to
the coupling along the z direction stems from
the phase dependence of the interference as the
motions are encoded in the optical phase
$\Delta\phi = \Delta\phi_0 + (k - 1/z_R)(z_1 - z_2)$ (4), where $\Delta\phi_0$
is the optical phase difference between the
trapping lasers in the focal plane. This previously
unexplored coupling mechanism results
in a long-range coupling rate that depends on
the distance as $(kd_0)^{-1}$ as well. The ratio of
the coupling rates along the z and x directions
depends on the inverse ratio of the mechanical
frequencies (~4), thus making it compelling
to explore interactions along the z direction.
Altogether, attaining precise control over the
optical phase difference $\Delta\phi$ allows us to realize
nonreciprocal and ultrastrong light-induced
dipole-dipole interaction between nanoscale
dielectric objects for the first time.

In the experiment, the phase-coherent trapping
lasers are generated in the first-order
diffraction of a spatial light modulator (SLM).
The lasers are focused by a microscope objective
into two independent traps (Fig. 1, A
and B). The total trapping power of $2P \approx 800$ mW
is split between the two traps as $P_{1,2} = P(1 \pm \eta)$,
with the control variable $\eta$, which allows us to
modify the mechanical frequencies along the
z axis as $\Omega_{1,2} \propto \sqrt{P(1 \pm \eta)}$ (Fig. 1C). We control
$\eta$, the optical phases $\phi_{1,2}$ at each trapping site,
and the trap separation $d_0$ (distance between
the trap foci along the x axis) with the SLM.
We set the laser polarization along the y axis
to maximize the dipole radiation along the x
axis and hence the interaction strength. Each
particle is randomly charged; therefore, we
calibrate their absolute charges by applying
an AC voltage to two electrodes placed along
the x axis and select particles based on the
desired charge (29). We monitor the particle
motion with homodyne detection of the light
transmitted from the optical traps.

Fig. 2. Avoided crossing between the breathing and the COM mode. We plot a spectrogram of the
relevant frequency region as a function of the control parameter $\eta$. Attractive ($g > 0$) and repulsive
($g < 0$) conservative dipole-dipole interactions are observed at the trap separation of $d_0 = 3.15$ µm and
for the optical phase differences $\Delta\phi_0 = 0$ (above) and $\Delta\phi_0 = \pi$ (below), respectively. The COM mode
is always at the frequency $\Omega_+/2\pi = 50$ kHz, whereas the breathing mode $\Omega_-$ has a higher or lower
frequency by $2g/2\pi = 8$ kHz for $\Delta\phi_0 = 0$ or $\Delta\phi_0 = \pi$, respectively. Black dashed lines are fits to
experimental data. The avoided crossing is shifted from $\eta = 0$ for $\Delta\phi_0 = \pi$ because of the interference
of the trapping lasers in the focal plane. The amplitude of the detected motion is not constant
owing to variations in detection sensitivity during the measurement.

The linearized dynamics with particle center-of-mass
(COM) positions $z_{1,2}$ follow as (29)

$$
\begin{aligned}
m\ddot{z}_1 + m\gamma\dot{z}_1 &= -(m\Omega_1^2 + k_1 + k_2)\,z_1 + (k_1 + k_2)\,z_2 \\
m\ddot{z}_2 + m\gamma\dot{z}_2 &= -(m\Omega_2^2 + k_1 - k_2)\,z_2 + (k_1 - k_2)\,z_1
\end{aligned}
\qquad (1)
$$

where m is the mass of each particle, and the
displacement resulting from the homogeneous
part of the forces has been absorbed in $z_{1,2}$. The
spring constant $k_1 = G\cos(kd_0)\cos(\Delta\phi_0)/kd_0$
describes the conservative part of the optical
forces (tunable optical binding), whereas
$k_2 = G\sin(kd_0)\sin(\Delta\phi_0)/kd_0$ describes a nonconservative
interaction, as indicated by a change
of sign between the equations. The nonreciprocity
of the interaction is maximized by
$\Delta\phi_0 = \pi/2 + n\pi$, where $n \in \mathbb{Z}$. The constant
$G \propto \alpha^2\sqrt{P_1 P_2}$ is a positive function of the trap
powers $P_{1,2}$ and the particle polarizability $\alpha$.
The scaling of G with $\alpha$ reflects the nature of
the dipole-dipole interaction. At the pressures
in our experiment, mechanical damping $\gamma$ is
dominated by the collisions with the surrounding
gas. For weak coupling between
the particles ($k_1, k_2 \ll m\Omega_{1,2}^2$), Eq. 1 yields the
eigenfrequencies of the coupled system
$\Omega_\pm(\eta_0) = \Omega + g \pm \sqrt{g^2 - \bar{g}^2}$, where we define
the conservative and nonconservative coupling
rates as $g = k_1/2m\Omega$ and $\bar{g} = k_2/2m\Omega$,
respectively. The control parameter
$\eta_0 = -k_2/m\Omega^2$ defines the value at which the
frequency splitting $\Omega_+ - \Omega_-$ is minimal and
$\Omega$ is the intrinsic mechanical frequency in
the absence of interactions at $\eta = 0$. Therefore,
at the avoided crossing, our system is
described as two harmonic oscillators with
frequencies $\Omega_{1,2}$ mutually coupled with nonreciprocal
coupling rates $g \pm \bar{g}$ (Fig. 1D), which
can be fully controlled by the distance $d_0$ and
the optical phase difference $\Delta\phi_0$. In the case of
purely conservative interaction ($\bar{g} = 0$) and for
equal mechanical frequencies, the normal
modes of the system become the COM mode
$z_+ = z_1 + z_2$ and the breathing mode $z_- = z_1 - z_2$.
Only the breathing mode is affected by the
interaction, such that its eigenfrequency
shifts to $\Omega_- = \Omega + 2g$, whereas the COM mode
eigenfrequency remains unchanged at $\Omega_+ = \Omega$
(normal mode splitting). In Fig. 1E, we show
the normal modes at $\Omega_+/2\pi \approx 51$ kHz and
$\Omega_-/2\pi \approx 63$ kHz in the power spectral density
of the detector signal, where the mechanical
frequency is $\Omega/2\pi \approx 51$ kHz when the interaction
is switched off.
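
The normal mode splitting can be reproduced with a short numerical sketch. It assumes the weak-coupling limit of Eq. 1 with gas damping neglected, in which the dynamics reduce to a 2×2 frequency matrix with diagonal entries $\Omega_{1,2} + g \pm \bar{g}$ and off-diagonal entries $-(g \pm \bar{g})$. The function name, this reduced matrix form, and the value of 1 kHz used for the nonconservative case are assumptions of the example. The 51 kHz frequency comes from the text, and g/2π = 6 kHz is the value implied by the 51 and 63 kHz modes through $\Omega_- - \Omega_+ = 2g$.

```python
import cmath
import math

TWO_PI = 2 * math.pi


def eigenfrequencies(omega1, omega2, g, g_bar):
    """Eigenfrequencies of two weakly coupled oscillators (damping neglected).

    ASSUMPTION: weak-coupling (rotating-wave) form of Eq. 1, i.e. the matrix
        [[omega1 + g + g_bar, -(g + g_bar)],
         [-(g - g_bar),        omega2 + g - g_bar]].
    Returns the two eigenvalues as complex numbers.
    """
    a = omega1 + g + g_bar             # diagonal entry for particle 1
    d = omega2 + g - g_bar             # diagonal entry for particle 2
    off = (g + g_bar) * (g - g_bar)    # product of the off-diagonal entries
    mean = 0.5 * (a + d)
    root = cmath.sqrt(0.25 * (a - d) ** 2 + off)
    return mean + root, mean - root


# Conservative case with equal intrinsic frequencies.
omega = TWO_PI * 51e3
g = TWO_PI * 6e3
upper, lower = eigenfrequencies(omega, omega, g, 0.0)
print("breathing mode (Hz):", upper.real / TWO_PI)
print("COM mode (Hz):      ", lower.real / TWO_PI)

# Purely nonconservative case (g = 0) with the intrinsic frequencies detuned
# so that the diagonal entries coincide; g_bar here is a hypothetical value.
g_bar = -TWO_PI * 1e3
mode_a, mode_b = eigenfrequencies(omega - g_bar, omega + g_bar, 0.0, g_bar)
print("imaginary parts (Hz):", mode_a.imag / TWO_PI, mode_b.imag / TWO_PI)
```

In the second case the square root becomes imaginary, so the two eigenfrequencies share one real part and form a complex-conjugate pair: one mode grows and the other decays until damping or nonlinearities, which this sketch leaves out, limit the motion.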

To obtain the coupling rate g, we measure
the eigenfrequencies as a function of $\eta$ (Fig. 2).
We set the trap separation to $d_0 \approx 3.15$ µm and
the optical phase difference to either $\Delta\phi_0 = 0$
(above) or $\Delta\phi_0 = \pi$ (below), such that the
interaction is purely conservative but of either
positive (attractive) or negative (repulsive) nature.
The spectrogram exhibits an avoided
crossing between the normal modes, which
typically occurs for equal intrinsic mechanical
frequencies ($\eta = 0$), but only if $|g| > \gamma/2$. All
measurements are conducted at pressures of
~1.5 mbar, thus the avoided crossing is observable
for coupling rates larger than
$\gamma/2\pi \approx 1.5$ kHz. From here on, the coupling rate
is expressed in units of the modified mechanical
frequency $\Omega' = \Omega + g$ as the ratio $g/\Omega'$ is
independent of the optical power. We observe
a frequency splitting of ~±8 kHz, which corresponds
to a coupling of $g/\Omega' = \pm(0.09 \pm 0.01)$.
Other coupling mechanisms can be neglected
in our investigations of the light-induced
dipole-dipole interaction. First, we select
particles with few charges, such that the additional
coupling as a result of the electrostatic
interaction is smaller than
$|g_C|/\Omega' = (1.4 \pm 0.6) \times 10^{-?}$ (29). Furthermore, interparticle
coupling resulting from the ambient gas
or liquid (aero- or hydrodynamic coupling)
can be dominant for large objects and small
distances; however, in our experiment it is
negligible because the ratio of the particle
radius to the trap separation is small
($r/2d_0 < 0.05$) (16, 18).

In our experiment, we achieve full control
over the conservative and nonconservative
coupling rates. Because the interaction arises
from the interference between the trapping
and scattered fields, we expect the coupling
rate to oscillate with particle distance with a
period of $\lambda$ and decay as $d_0^{-1}$ owing to the
far-field nature of the dipole radiation at distances
$d_0 > \lambda$. We measure the normal mode
splitting for trap separations $d_0$ in the range
of 2.2 to 3.7 µm and for phase differences
$\Delta\phi_0 = 0$ (blue points) or $\Delta\phi_0 = \pi$ (orange
points) to maximize the conservative interaction
(Fig. 3A). Good agreement is observed
between our theoretical model and the measured
coupling rates in both cases (29). At a distance
of $d_0 = 2.2$ µm and for $\Delta\phi_0 = 0$, we observe the
maximum coupling of $g/\Omega' = 0.186 \pm 0.017$.
The effect of the dominant nonconservative
interaction is apparent for the trap separation
of $d_0 = 2.2$ µm and the phase difference of
$\Delta\phi_0 = 0.8\pi$ (Fig. 3B). The constant pumping
of energy into the system as a result of the
nonconservative forces increases the particle
motional amplitude by an order of magnitude.
The eigenfrequencies are degenerate for
$\eta \in [0, 0.07]$, from which we estimate the couplings
of $g/\Omega = 0$ and $\bar{g}/\Omega \approx -0.017$. To demonstrate
the dependence of the coupling on the optical phase
difference $\Delta\phi_0$, we measure the normal mode
splitting at a fixed separation of $d_0 \approx 3.2$ µm
(Fig. 3C). Our theoretical model of linear
interactions (blue line) fails to fully predict
the observed behavior because of several effects.
For example, the actual interparticle
distance is different from the trap separation
owing to the radiation pressure force of the
dipole radiation, which provides a constant
displacement force. Moreover, in the absence
of an additional cooling mechanism, the particles
are able to explore nonlinear terms in
the interaction Hamiltonian, which affects the
eigenfrequencies and modifies the normal mode
splitting. We observe a zero crossing because
of the absence of the conservative forces at
the phase of $\Delta\phi_0 \approx 0.8\pi$, in agreement with
our measurement in Fig. 3B. In future work,
feedback cooling can be used to constrain the
particle motion within linear dynamics.
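
The expected oscillation with period $\lambda$, the $d_0^{-1}$ envelope, and the sign change between $\Delta\phi_0 = 0$ and $\Delta\phi_0 = \pi$ all follow from the linearized spring constants $k_1$ and $k_2$ defined with Eq. 1. The sketch below evaluates those two expressions with the prefactor G set to 1, which is an arbitrary normalization chosen for illustration, and with a hypothetical function name. It covers the linearized model only, which the text reports does not fully capture the measured phase dependence.

```python
import math

WAVELENGTH = 1.064e-6            # m, trapping wavelength quoted in the text
K = 2 * math.pi / WAVELENGTH     # wave number k = 2*pi/lambda


def spring_constants(d0, dphi0, G=1.0):
    """Linearized conservative (k1) and nonconservative (k2) spring constants.

    k1 = G cos(k d0) cos(dphi0) / (k d0)
    k2 = G sin(k d0) sin(dphi0) / (k d0)
    ASSUMPTION: G = 1 is an arbitrary normalization, not a measured value.
    """
    kd = K * d0
    k1 = G * math.cos(kd) * math.cos(dphi0) / kd
    k2 = G * math.sin(kd) * math.sin(dphi0) / kd
    return k1, k2


d0 = 2.2e-6  # m, smallest trap separation reported in the text
for label, dphi0 in (("0", 0.0), ("pi/2", math.pi / 2), ("pi", math.pi)):
    k1, k2 = spring_constants(d0, dphi0)
    print(f"dphi0 = {label:>4}: k1 = {k1:+.4f}, k2 = {k2:+.4f}")

# One wavelength farther away the oscillation repeats and only the 1/(k d0)
# envelope has changed.
k1_near, _ = spring_constants(d0, 0.0)
k1_far, _ = spring_constants(d0 + WAVELENGTH, 0.0)
print("envelope ratio:", k1_far / k1_near, "expected:", d0 / (d0 + WAVELENGTH))
```

In this model the conservative term vanishes at $\Delta\phi_0 = \pi/2$, where the nonconservative term is largest, whereas the measured zero crossing lies near $0.8\pi$ for the reasons given above.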

Fig. 3. Controllable dipole-dipole coupling. (A) Trap separation $d_0$ is changed while keeping the
optical phase difference fixed at $\Delta\phi_0 = 0$ (blue circles) or $\Delta\phi_0 = \pi$ (orange circles). We observe a
change of the coupling rate g with periodicity of ~$\lambda$ and an envelope that drops off as $d_0^{-1}$.
Amplitudes of blue and orange lines are calculated from the system parameters with the grayed
region given by the standard deviation of the particle size. (B) At the trap separation $d_0 = 2.2$ µm,
the particles experience a combination of the conservative and nonconservative forces. For the
optical phase difference of $\Delta\phi_0 = 0.8\pi$, only the nonconservative interaction is present ($g = 0$), and
the eigenmodes are degenerate for $\eta$ between 0 and 0.07. The motion is strongly amplified in this
region. The dashed lines are theory based on the estimated conservative and nonconservative
coupling rates. (C) We set the trap separation at $d_0 = 3.2$ µm and tune the optical phase difference
$\Delta\phi_0$ from 0 to $2\pi$ and measure the mode splitting $\Omega_- - \Omega_+$. Interaction is mostly conservative at
$\Delta\phi_0 = n\pi$ ($n \in \mathbb{Z}$) and can be explained by the linearized model (blue line). The nonconservative
force contributes to the total force for all other values of $\Delta\phi_0$ and is able to amplify the particle
motion, thus modifying the normal mode splitting.

Fig. 4. Turning off dipole-dipole interaction to detect electrostatic interaction. (A) For arbitrary
polarization angle $\Theta$, the interference of electric fields is suppressed by $\cos^2\Theta$. Two special cases of $\Theta = 0°$
and $\Theta = 90°$ are presented in green and orange rectangles, respectively. In the case of $\Theta = 90°$ there is no
interference of the trapping and the scattered fields. (B) We measure the coupling rate resulting from the
dipole-dipole interaction as a function of the polarization angle $\Theta$ between particles with an absolute number
of 1 ± 1 and 0 ± 1 charges (circles). The interaction is maximal for the angle $\Theta = 0°$ (green circle). The
avoided crossing is unresolved for coupling rates smaller than the mechanical linewidth $\gamma$ (gray region).
At the angle $\Theta = 90°$ the dipole-dipole interaction is suppressed (orange circle). We measure coupling
resulting from the electrostatic interaction between particles with 96 ± 21 and 110 ± 24 charges (orange
triangle). (C) The avoided crossing is absent for horizontally polarized tweezers ($\Theta = 90°$) because the
far-field dipole-dipole interaction is strongly suppressed. (D) In the absence of optical interactions, we observe
the avoided crossing for highly charged particles owing to strong electrostatic interaction.

Rotating the trapping laser polarization by
an angle $\Theta$ from the y axis provides for another
way to control the dipole-dipole interaction
(Fig. 4A). The magnitude of the dipole radiation
along the x axis is smaller by a factor of
$\cos\Theta$ owing to the characteristic spatial profile
of the dipole radiation in the far field. The
interference of the dipole radiation with the
trapping field is suppressed by a factor of
$\cos\Theta$ as a result of the scalar product of the
two field components. Altogether, this yields
a decrease of the coupling rate by $\cos^2\Theta$ in the
far-field approximation, which is confirmed
in the measurement in Fig. 4B (circles and
blue line). For the angle $\Theta = 90°$ the residual
dipole-dipole interaction scales with $(kd)^{?}$
because of the radial near-field component of
the radiated field. We estimate the coupling
of $g/\Omega' = 6 \times 10^{-?}$ at $d_0 \approx 3\lambda$, which we are
unable to detect in the current experiment as
$g/\gamma < 10^{-?}$ (Fig. 4C) (29). Suppression of the
dipole-dipole interaction allows us to explore
electrostatic interaction between strongly
charged particles. We trap particles with
absolute charges $|q_1|/e = 96 \pm 21$ and
$|q_2|/e = 110 \pm 24$ and equal signs ($q_1 q_2 = |q_1 q_2|$) and
observe an avoided crossing as a result of
electrostatic coupling (Fig. 4D). The ordering
of normal mode frequencies $\Omega_- < \Omega_+$ reflects
the repulsive interaction between the
particles. The measured coupling
$g_C/\Omega' = -0.058 \pm 0.003$ fits well to the expected
$g_C/\Omega' = -0.047 \pm 0.015$. Because the electrostatic
coupling rate scales as $\propto d_0^{-3}$, we
infer that electrostatic interaction between
particles, each carrying a single charge, can be
resolved at a distance of $d_0 \approx 2$ µm and at pressures
below $10^{?}$ mbar. Altogether, our platform
allows for exploring hybrid schemes with both
dipole-dipole and electrostatic interactions.

We have demonstrated the controllable
(attractive and repulsive) light-induced dipole-dipole
interaction between two silica nanoparticles
levitated in distinct optical traps with
coupling rates up to 20% of the mechanical
frequency. These results expand the toolbox
of optical binding by trapping in a phase-coherent
optical tweezers array, which will
enable further studies of optical interactions
between Rayleigh particles (9) or atoms (30)
at subwavelength distances. Furthermore, we
control the nonreciprocal interactions between
the particles by tuning conservative and nonconservative
interactions. By effectively switching
off the optical interaction, we observe
electrostatic interaction between two charged
particles. The demonstrated strength and level
of control of optical and electrostatic interactions
in arrays of levitated solid-state objects,
in combination with the previously realized
quantum state preparation, provides a platform
that may open up many research avenues in
quantum physics. We foresee that this platform,
with a possible addition of an optical cavity, can
be used for quantum simulation with mechanical
degrees of freedom (31-33), enhanced quantum
sensing (34), collective effects (35), and phonon
transport and thermalization (24).
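
The polarization control reported in this work combines two factors of $\cos\Theta$, one from the dipole radiation pattern and one from the scalar product with the trapping field. The sketch below covers the far-field approximation only and leaves out the near-field residual at $\Theta = 90°$. The function name is hypothetical, and the value 0.09 used for the coupling at $\Theta = 0°$ is an illustrative input rather than the value measured for the polarization scan.

```python
import math


def coupling_vs_polarization(g_at_zero, theta_deg):
    """Far-field dipole-dipole coupling at polarization angle theta (degrees).

    One factor cos(theta) comes from the dipole radiation pattern along x,
    a second from the scalar product of the scattered and trapping fields.
    """
    c = math.cos(math.radians(theta_deg))
    return g_at_zero * c * c


g0 = 0.09  # ASSUMPTION: illustrative coupling g/Omega' at theta = 0
for theta in (0, 30, 60, 90):
    print(f"theta = {theta:2d} deg: g/Omega' = {coupling_vs_polarization(g0, theta):.4f}")
```

At $\Theta = 90°$ this far-field term vanishes, which is the condition the experiment uses to expose the electrostatic coupling.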
METASURFACES


Resonant metasurfaces for generating complex
quantum states


Tomás Santiago-Cruz, Sylvain D. Gennaro, Oleg Mitrofanov, Sadhvikas Addamane,
John Reno, Igal Brener*, Maria V. Chekhova*


Quantum state engineering, the cornerstone of quantum photonic technologies, mainly relies on spontaneous 
parametric downconversion and four-wave mixing, where one or two pump photons spontaneously decay 
into a photon pair. Both of these nonlinear effects require momentum conservation for the participating 
photons, which strongly limits the versatility of the resulting quantum states. Nonlinear metasurfaces have 
subwavelength thickness and allow the relaxation of this constraint; when combined with resonances,
they greatly expand the possibilities of quantum state engineering. Here, we generated entangled photons via
spontaneous parametric downconversion in semiconductor metasurfaces with high-quality factor, quasi-bound 
state in the continuum resonances. By enhancing the quantum vacuum field, our metasurfaces boost the 
emission of nondegenerate entangled photons within multiple narrow resonance bands and over a wide 
spectral range. A single resonance or several resonances in the same sample, pumped at multiple wavelengths, 
can generate multifrequency quantum states, including cluster states. These features reveal metasurfaces 
as versatile sources of complex states for quantum information. 


Optical quantum state engineering mainly
relies on nonlinear optical effects such as
spontaneous parametric downconversion
(SPDC) or spontaneous four-wave mixing
(SFWM). These effects have been used to
create a vast variety of photonic quantum states,
including single (1) and entangled (2) photons,
squeezed states (3), and cluster states (4-6). 
However, both SPDC and SFWM in conventional 
nonlinear crystals and waveguides require strict 
momentum conservation for the involved pho- 
tons, which strongly limits the versatility of the 
states they produce. The emergent concept of 
quantum optical metasurfaces (QOMs) helps 
to overcome this constraint. Metasurfaces (i.e., 
arrays of nanoresonators) feature unique abil- 
ities to manipulate and control the amplitude, 
phase, and polarization of light in the non-
linear (7-9) and quantum (10, 11) regimes using
a single ultrathin device. In particular, all- 
dielectric metasurfaces made of materials 
with high second-order nonlinearities offer a 
potential route for on-chip quantum state gen- 
eration (12-14). As a result of the subwave- 
length thickness of metasurfaces, the momentum 
conservation (or phase-matching) requirement 
is relaxed (15), enabling multiple nonlinear 
processes to occur with comparable efficiencies
(16). In addition, optical resonances in meta-
surfaces and nanoresonators enhance the vac-
uum field fluctuations through enhanced
density of states at certain wavelengths, boost-
ing the spontaneous emission of photons (17).
Vacuum fluctuation enhancement scales
with the quality (Q) factor of the resonance. 
In metasurfaces, Q-factors are especially high 
for bound states in the continuum (BIC) res- 
onances (18, 19), which are discrete-energy 
modes whose energy levels overlap with a 
continuous spectrum of radiating modes (20). 
In symmetry-protected BIC metasurfaces, the 
outcoupling of radiation in the normal direc- 
tion is forbidden by symmetry (19). Hence, 
Q-factors of these modes may be infinite; 
in theory, they could infinitely enhance the 
spontaneous emission of photons and photon 
pairs. In practice, symmetry breaking (quasi- 
BICs) leads to finite enhancement (19), which 
can still be as high as 107 to 10*. 

Here, we report on the experimental gen- 
eration of tunable photon pairs via SPDC 
driven by high-Q quasi-BIC resonances in gal- 
lium arsenide (GaAs) QOMs. Our QOMs emit 
frequency-degenerate and nondegenerate 
narrowband photon pairs tunable over more 
than 100 nm by changing either the optical 
pump or the spectral location of the resonances 
without appreciable loss of efficiency. More- 
over, by judicious choice of resonance and 
pump wavelengths, we can simultaneously 
drive as many SPDC processes as necessary, 
obtaining frequency-multiplexed entangled 
photons and enabling multichannel heralding. 
Our work paves the way for building nanoscale 
sources of complex tunable entangled states 
for quantum networks. 

In SPDC, a pump photon of a higher frequency,
$\omega_p$, downconverts in a second-order
nonlinear material into a pair of signal and
idler photons of lower frequencies, $\omega_s$ and
$\omega_i$, following energy conservation (Fig. 1A).
Unlike in bulk crystals, SPDC in subwave- 
length sources does not require longitudinal 
momentum conservation (15), leading to the 


broadband emission of photon pairs over a 
wide range of angles (21, 22). In optical nano- 
antennae and metasurfaces, however, the 
resonances select the range of wavelengths 
and wave vectors where the photon emission 
is enhanced (12, 14, 23). Therefore, with 
judicious choice and design of optical modes 
and resonances, metasurfaces can be used to 
generate tunable and unidirectional entangled 
photons. 

To demonstrate SPDC with quasi-BICs, we 
fabricated various arrays of broken-symmetry 
resonators arranged in a square lattice of dif- 
ferent periodicities and scaling by means 
of standard electron beam lithography and 
chlorine-based dry etching. Subsequent fab- 
rication steps included epoxy bonding and 
curing, substrate lapping and polishing, wet 
etching, and transferring the metasurface onto 
a transparent fused silica substrate. We chose 
GaAs because it possesses one of the highest 
second-order susceptibilities among traditional
materials, $\chi^{(2)} = 400$ to 500 pm/V, for
the range of pump wavelengths involved in 
this work; these susceptibilities exceed those 
of ferroelectric nonlinear materials such as 
lithium niobate by more than an order of mag- 
nitude (24). The structure of the metasurfaces 
is shown in Fig. 1B. 

The existence of symmetry-protected BICs 
can be explained through symmetry breaking 
and coupling between allowed and forbidden 
optical Mie modes (25), or using group theory 
arguments (26). A metasurface consisting of 
square nanoresonators obeys $C_2$ and $C_4$ rotational
symmetry; photons at the BIC frequency
are trapped inside the resonators because of 
zero coupling to radiating modes. A small 
notch in the cube breaks the rotational sym- 
metry (Fig. 1C) and transforms these symmetry- 
protected BICs into quasi-BICs, which can 
outcouple to the far field. Spectrally, they ap- 
pear as narrow transmission peaks in the 
white-light far-field transmittance (Fig. 1D). 
The modes, labeled electric dipole (ED)-qBIC 
and magnetic dipole (MD)-qBIC, reach Q-factors 
of $Q_{\mathrm{BIC(ED)}} = 330$ and $Q_{\mathrm{BIC(MD)}} = 1000$, respectively.
The ED-qBIC and MD-qBIC have different
coupling efficiencies with respect to the incident 
beam polarization. For the lowest-order quasi- 
BICs, simulated electric field profiles resemble 
those of out-of-plane dipole modes (see Fig. 1D, 
insets, and fig. S1). 

By tuning the period of the array and the 
proportions of the resonators, the central wave- 
lengths of the quasi-BIC resonances can be 
tuned over a wide range. For emission normal 
to the QOM, the Q-factor is the highest be- 
cause most of the system symmetry is pre- 
served. Off-normal excitation leads to coupling 
to free-space radiation, reducing the Q-factor 
of each quasi-BIC resonance and lowering the 
field enhancement (fig. S2). Thus, we should 
also expect the emitted SPDC photon pairs to 

Fig. 1. Spontaneous parametric downconversion (SPDC) using symmetry-protected quasi-BIC resonances in a semiconductor metasurface.
(A) Conceptual diagram of multiplexed entangled photon generation in a multiresonance semiconductor metasurface.
(B) Scanning electron micrograph of the metasurface at an intermediate step of nanofabrication (see supplementary materials).
(C) Structure of the broken-symmetry resonators (see table S1 for the various metasurfaces’ dimensions). The addition of a small rectangle (pink) breaks the rotational symmetries $C_2$, $C_4$ of the metasurface, turning the BIC into a quasi-BIC.
(D) Simulated (top) and measured [...] and orthogonal to it (blue). Gray areas highlight the locations of the quasi-BIC resonances: ED-qBIC ($\lambda = 1446.9$ nm) and MD-qBIC ($\lambda = 1511.8$ nm). The insets show the normalized distribution and direction (black arrows) of electric field $E_{xy}$ calculated at the center of the nanoresonator for both resonances.
(E) Typical distribution of the photon arrival time difference for two detectors, demonstrating photon pair generation. Error bars denote the statistical uncertainty.


be radiated almost unidirectionally along the 
metasurface normal, as observed for emission 
from quantum dots embedded inside a symmetry- 
protected, quasi-BIC metasurface (17). 

To demonstrate multiplexed entangled pho- 
ton generation via SPDC, we investigated three 
QOMs, labeled QOM-A, QOM-B, and QOM-C, 
with different resonator sizes and spacings, 
such that the ED-qBIC and MD-qBIC optical 
modes were resonant at different wavelengths 
(Fig. 2, A to C, upper panels; also see table S1 
and fig. S3). All metasurfaces were pumped 
with linearly polarized continuous-wave lasers 
of different wavelengths, focused into 140-μm
spots. Two superconducting nanowire single- 
photon detectors placed at the two outputs of 
a fiber beamsplitter (fig. S4) enabled registra- 
tion of photon pairs as joint detection events. 
For all QOMs considered, we registered a high 
number of simultaneous photon detections— 
coincidences (Fig. 1E)—which indicates the 
presence of photon pairs. For further details, 
see the supplementary materials. Figures S5 and 
S6 also show the results of second-harmonic 
generation, which is mediated by the same 
nonlinear tensor as SPDC and was therefore 
helpful for optimizing the experiment. 

We then performed fiber-assisted spectros- 
copy of photon pairs with 3-nm resolution (see 
supplementary materials). The three correspond- 
ing SPDC spectra are shown in Fig. 2, A to C 
(bottom panels), each one below its white-light 
transmission spectrum. Thanks to the relaxed 
momentum conservation, the pump wavelength
$\lambda_p$ can be arbitrary, and thus we use different
pumps and different QOMs to demonstrate 
various types of photon pair generation. 

For QOM-A, we used $\lambda_p = 723.4$ nm, such
that the degenerate wavelength $2\lambda_p$ overlapped
with the ED-qBIC resonance wavelength. In
this case, both SPDC photons were emitted
within a single narrow peak, centered at the
resonance wavelength $2\lambda_p = 1446.8$ nm (Fig. 2A).
The full width at half maximum (FWHM) of 
4.3 nm matched the linewidth of the ED-qBIC 
resonance. This corroborates previous obser- 
vations on QOMs (12) that the presence of an 
optical resonance enhances the quantum vac- 
uum field within the resonance’s bandwidth. 
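
As a quick consistency check of the linewidth quoted above, a resonance of quality factor Q centered at wavelength λ has a FWHM of roughly λ/Q. The sketch below applies this textbook relation to the Q-factors and resonance wavelengths reported for Fig. 1D. The function name is hypothetical, and the comparison assumes that the Fig. 1D resonances are representative of QOM-A.

```python
def resonance_fwhm_nm(center_wavelength_nm: float, q_factor: float) -> float:
    """Estimate the FWHM linewidth (nm) of a resonance from Q = lambda / FWHM."""
    if q_factor <= 0:
        raise ValueError("Q-factor must be positive")
    return center_wavelength_nm / q_factor


# Values reported for the metasurface of Fig. 1D.
ed_fwhm = resonance_fwhm_nm(1446.9, 330)    # ED-qBIC
md_fwhm = resonance_fwhm_nm(1511.8, 1000)   # MD-qBIC
print(f"ED-qBIC FWHM ~ {ed_fwhm:.1f} nm, MD-qBIC FWHM ~ {md_fwhm:.1f} nm")
```

The ED-qBIC estimate of about 4.4 nm is close to the measured 4.3-nm SPDC linewidth, and the higher-Q MD-qBIC mode is expected to be correspondingly narrower.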

However, very different spectra were ob- 
tained when we pumped QOM-B at 718.2 nm, 
such that the degenerate wavelength $2\lambda_p$ was
off-resonance from the ED-qBIC modes. Now, 
we found that the SPDC spectrum exhibited 
two narrow peaks, one centered at the ED-qBIC 
peak wavelength ($\lambda_s = 1390.9$ nm) and the other
at a wavelength $\lambda_i = 1485$ nm (Fig. 2B). Here, the
ED-qBIC resonance enhances the vacuum field 
at the signal wavelength, forcing the QOM 
to emit a photon at this wavelength simul- 
taneously with its partner at the idler wave- 
length, as dictated by energy conservation: 


$$\frac{1}{\lambda_p} = \frac{1}{\lambda_s} + \frac{1}{\lambda_i} \qquad (1)$$


This indicates that narrowband, frequency- 
nondegenerate photon pairs (ubiquitous in 
photonic state engineering, i.e., for heralding) 
have been generated from a nanostructured 
photonic device such as a metasurface. 
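
The energy-conservation constraint can be checked numerically against the wavelengths quoted above. The following is a minimal sketch, with a hypothetical helper name, that solves 1/λ_p = 1/λ_s + 1/λ_i for the idler wavelength; it assumes vacuum wavelengths in nanometers and nothing else about the experiment.

```python
def idler_wavelength_nm(pump_nm: float, signal_nm: float) -> float:
    """Idler wavelength from energy conservation: 1/lp = 1/ls + 1/li."""
    if signal_nm <= pump_nm:
        raise ValueError("signal wavelength must exceed the pump wavelength")
    return 1.0 / (1.0 / pump_nm - 1.0 / signal_nm)


# QOM-B values quoted in the text: pump at 718.2 nm, signal at 1390.9 nm.
print(f"idler ~ {idler_wavelength_nm(718.2, 1390.9):.1f} nm")

# Degenerate case (QOM-A): a signal at twice the pump wavelength
# has an idler at the same wavelength.
print(f"degenerate ~ {idler_wavelength_nm(723.4, 2 * 723.4):.1f} nm")
```

For the QOM-B numbers this reproduces the idler peak near 1485 nm, and for QOM-A it returns the degenerate wavelength of 1446.8 nm.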

The spectrum of the entangled photons gets 
richer when both ED-qBIC and MD-qBIC 
resonances are active in SPDC (Fig. 2C). We 
achieved this by pumping our third meta- 
surface QOM-C at 725.4 nm, with the pump 
radiation polarized at 175°, so that the MD-qBIC
resonance was activated. We now observed four 
peaks, two of them corresponding to signal 
photons emitted at the ED-qBIC (1359.4 nm) 
and MD-qBIC (1429.4 nm) resonances, and 
the other two to their idler partners. We found 
that because the MD-qBIC resonance has a 
higher Q-factor than the ED-qBIC resonance, 
its emission FWHM is smaller, causing the 
signal and idler SPDC spectra to also be nar- 
rower. It might be possible to obtain higher 
rates (i.e., larger peak heights), but we note 
that the Q-factor of a resonance provides only 
the upper boundary for SPDC enhancement, 
which also depends on the vacuum field 
distribution in the nanoresonators and the 
nonlinear tensor symmetry and values. 

We also found that our QOMs can produce 
degenerate and nondegenerate photon pairs 
over a broad spectral range without consid- 
erable reduction of efficiency (Fig. 2D). As a 
consequence of the high-Q resonances, the ef- 
ficiency is at least three orders of magnitude 

Fig. 2. SPDC spectra for the QOMs considered in this work. (A to C) Measured white-light transmission
spectra (top) and SPDC spectra (bottom) for QOMs A, B, and C, respectively. The SPDC spectra show
(A) degenerate photon pairs at wavelength $2\lambda_p = 1446.8$ nm (vertical dashed line), which overlaps with the
ED-qBIC resonance of QOM-A; (B) nondegenerate photon pairs, where only the signal photon is emitted at
the ED-qBIC mode wavelength $\lambda = 1390.9$ nm (purple vertical solid line) of QOM-B; (C) two types of
nondegenerate photon pairs, where signal photons are emitted at wavelengths $\lambda = 1359.4$ nm of the ED-qBIC
resonance (orange vertical solid line) and $\lambda = 1429.4$ nm of the MD-qBIC resonance (green vertical solid
line) of QOM-C. Both ED- and MD-qBIC modes are active in SPDC because of the choice of pump polarization.
Black dashed lines show the degenerate wavelength $2\lambda_p$. The peak heights at signal and idler wavelengths are
unequal because of the different detection efficiency of the two IR detectors and the asymmetric splitting
ratio of the fiber splitter. (D) Coincidence rate, with 9.6 mW pumping at 725.4 nm, versus wavelength
detuning $\Delta\lambda = 2\lambda_p - \lambda_{\text{ED-qBIC}}$ between the degenerate wavelength ($2\lambda_p$) and the ED-qBIC resonance, $\lambda_{\text{ED-qBIC}}$,
of five different QOMs. Error bars denote the statistical uncertainty.


higher than in an unpatterned GaAs film of 
the same thickness (see supplementary text 
and fig. S7). 

To verify the nonclassicality of our photon 
pairs, we measured the pump-power depen- 
dence of the second-order cross-correlation 
function (CF) at zero time delay, defined as 
$g^{(2)}_{si}(0) = \langle N_s N_i \rangle / (\langle N_s \rangle \langle N_i \rangle)$, where $N_{s,i}$ are
photon-number operators for the signal and
idler modes. The cross-CF is measured as
$g^{(2)}_{si}(0) = R_c / (R_s R_i T_c)$, where $R_{c,s,i}$ are the rates
of the signal-idler coincidences, signal photon
detections, and idler photon detections, respectively,
and $T_c$ is the coincidence resolution
time (27). Because all three numbers $R_{c,s,i}$ scale
linearly with the pump power, $g^{(2)}_{si}(0)$ has an
inverse dependence on it, as indeed we ob- 
served (Fig. 3A). Although this dependence 
indicates photon pair detection, a formal proof 
of nonclassicality requires the violation of the 
Cauchy-Schwarz (CS) inequality (28), 


$$\left[ g^{(2)}_{si}(0) \right]^2 \le g^{(2)}_{ss}(0) \cdot g^{(2)}_{ii}(0) \qquad (2)$$


where $g^{(2)}_{ss,ii}(0)$ are second-order auto-CFs
for signal and idler modes, respectively. For 
QOM-B, which generates signal and idler 
photons at two distinguishable wavelengths 
(1390.9 nm and 1485 nm), we measured the 
auto-CFs of modes s and i after 50-nm FWHM
bandpass filters centered at 1400 nm (s) and 
1475 nm (i), respectively (Fig. 3, B and C). 
Single-count graphs for each measurement 
are shown in fig. S9. We obtained $g^{(2)}_{ss}(0) = 1.6 \pm 0.3$
and $g^{(2)}_{ii}(0) = 1.2 \pm 0.2$. The cross-CF,
measured without bandpass filters (Fig. 3D),
was $g^{(2)}_{si}(0) = 10.5 \pm 1.1$. Hence, the CS inequality
is violated by >50 standard deviations,
$[g^{(2)}_{si}(0)]^2 = 110 \pm 2$, $g^{(2)}_{ss}(0) \cdot g^{(2)}_{ii}(0) = 1.9 \pm 0.3$,
revealing the nonclassical character of photon 
pairs. 
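
The two quantities used in this paragraph can be sketched in a few lines. The first function evaluates the cross-correlation from measured rates as R_c/(R_s R_i T_c); the second expresses the Cauchy-Schwarz violation in standard deviations. Both function names and the example rates are hypothetical, and adding the two uncertainties in quadrature is an assumption of this sketch, not a statement of how the significance was computed in the experiment.

```python
import math


def cross_correlation(rate_coinc: float, rate_signal: float,
                      rate_idler: float, resolution_time: float) -> float:
    """g_si^(2)(0) = R_c / (R_s * R_i * T_c); rates in Hz, time in s."""
    return rate_coinc / (rate_signal * rate_idler * resolution_time)


def cs_violation_sigma(lhs: float, lhs_err: float,
                       rhs: float, rhs_err: float) -> float:
    """Number of standard deviations by which lhs exceeds rhs.

    Assumption: the two uncertainties are independent and add in quadrature.
    """
    return (lhs - rhs) / math.hypot(lhs_err, rhs_err)


# Reported values: [g_si]^2 = 110 +/- 2 and g_ss * g_ii = 1.9 +/- 0.3.
print(f"CS violation ~ {cs_violation_sigma(110, 2, 1.9, 0.3):.0f} sigma")

# Hypothetical rates: doubling the pump power doubles R_c, R_s, and R_i,
# so the cross-correlation is halved.
low = cross_correlation(10.0, 1e4, 1e4, 1e-9)
high = cross_correlation(20.0, 2e4, 2e4, 1e-9)
print(f"g_si at power P: {low:.0f}, at power 2P: {high:.0f}")
```

With the reported numbers the violation exceeds 50 standard deviations, and the toy rates show the inverse dependence of the cross-correlation on pump power.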


Fig. 3. Second-order cross-correlation and autocorrelation functions. (A) Pump-power dependence of the second-order cross-correlation function at zero time delay measured for photon pairs emitted by QOM-A (points) and its theoretical fit (line). For raw data, see fig. S8. (B and C) Second-order autocorrelation functions $g^{(2)}_{ss}$ and $g^{(2)}_{ii}$ measured for QOM-B at the signal (1390.9 nm) and idler (1485 nm) wavelengths, respectively. (D) Second-order cross-correlation function $g^{(2)}_{si}$ measured for QOM-B. Error bars are obtained from the statistical uncertainties via the propagation of errors. [Axes: pump power (mW) in (A); arrival time difference (ns) in (B) to (D).]


Fig. 4. Cluster state generation with QOMs. (A) Examples of cluster states: a linear three-qubit state
(top), a Greenberger-Horne-Zeilinger state (middle), and a more general graph state (bottom). Links between
the vertices are provided by different pumps. (B) SPDC spectrum illustrating how a linear three-qubit graph
state of photons $|\lambda_{\mathrm{res}}\rangle$, $|\lambda_1\rangle$, $|\lambda_2\rangle$ can be generated from QOM-C. (C) Spatial multiplexing of four
metasurfaces for generating the state of (A), bottom, using a single multifrequency pump beam.


The presence of narrow optical resonances 
allows one to create, apart from entangled 
states, more complicated graph quantum states 
(Fig. 4A). In our setup (fig. S4), the state prep- 
aration relies on the fiber beamsplitter send- 
ing a photon to each detector; events where 
both photons arrive at the same detector are 
ignored. For a nondegenerate photon pair at
wavelengths $\lambda_{\mathrm{res}}$ (resonant) and $\lambda_1$ (matching
via energy conservation), after the beamsplitter,
photon $|\lambda_{\mathrm{res}}\rangle$ is path-entangled with
photon $|\lambda_1\rangle$. For a second nondegenerate pair
at wavelengths $\lambda_{\mathrm{res}}$ and $\lambda_2$, photon $|\lambda_{\mathrm{res}}\rangle$ is
also path-entangled with photon $|\lambda_2\rangle$ after the
beamsplitter. Note that this photon pair needs
another pump for its generation. If photons
$|\lambda_{\mathrm{res}}\rangle$ from each pair are indistinguishable,
which requires two mutually coherent pumps,
a linear graph state of three pairwise entangled
qubits is created (Fig. 4A, top). This strategy for 
cluster state generation, called pairwise coupling 
(6), can be implemented using pump beams 
at wavelengths of 725.4 nm and 718.2 nm and
an ED-qBIC resonance at $\lambda_{\mathrm{res}} = 1359.4$ nm.
Figure 4B shows the corresponding spectra
with orange and blue colors, respectively, with
only two entangled photon pairs ($|\lambda_{\mathrm{res}}\rangle$ and $|\lambda_1\rangle$,
$|\lambda_{\mathrm{res}}\rangle$ and $|\lambda_2\rangle$) marked for simplicity. In our
experiment, the pump beams are incoherent, 
but they still provide the desired spectrum. 
By adding a third pump, generating photon 
pairs at wavelengths $\lambda_{\mathrm{res}}$ and $\lambda_3$, a more complicated
graph state can be created, called a
Greenberger-Horne-Zeilinger state (5) (Fig. 4A, 
middle). 

The state becomes increasingly complex by 
adding multiple coherent pump beams at dif- 
ferent wavelengths—that is, using a frequency 
comb or a filtered supercontinuum as an ex- 
citation source. By appropriately matching the 
wavelength separation of the comb and the 
optical resonances of the QOMs, photons at 
multiple wavelengths can be entangled via 
pairwise couplings. With this approach, one 


could implement a scalable cluster state, needed 
for one-way quantum computation (5), as 
shown in Fig. 4A (bottom). Such methods of 
quantum state engineering are enabled by the 
use of our QOMs with relaxed phase match- 
ing and engineered high-Q resonances, and 
are impossible with bulk crystal or waveguide 
SPDC sources. Moreover, QOMs provide a 
unique way to spatially multiplex multiple 
metasurfaces within the area of a single pump 
beam (possibly multifrequency) by exciting 
resonances at different wavelengths and en- 
tangling multiple photon pairs across separate 
wavelengths with a single multifrequency pump 
beam, as shown in Fig. 4C. But even without 
links between different pairs, the existence of 
multifrequency or multispatial channels sug- 
gests a new architecture for heralding single 
photons in many different modes (29). 

The use of a highly nonlinear metasurface 
with narrowband resonances at arbitrary, dis- 
crete, and multiple wavelengths enables new 
opportunities for generating quantum states 
that have no counterpart when using tradi- 
tional nonlinear optical crystals or passive (30) 
or conventional Mie-type metasurfaces (12). 
First, we have shown enhancement in pair 
emission rates of at least three orders of mag- 
nitude relative to an unpatterned film of the 
same material and thickness. Second, we have 
demonstrated nonclassical correlations be- 
tween downconverted photons at broadly 
separated wavelengths, opening the possi- 
bility of photon heralding. Third, combining 
single or multiple coherent laser sources with 
two or more quasi-BIC resonances from a single 
or multi-patch metasurface, we can create pho- 
ton pairs at multiple frequencies with compa- 
rable efficiencies. Such a multitude of photon 
pairs can be used to create complex photon 
quantum states, including cluster states and 
multichannel single photons, that could facil- 
itate compact quantum information process- 
ing (4, 5, 29). Finally, achieving this level of 
quantum state engineering with a single nano- 
scale source can potentially lead to future 
miniaturization of photonic quantum process- 
ing, not possible today with other photonic 
platforms. 

Massively degenerate coherent perfect absorber for arbitrary wavefronts

Yevgeny Slobodkin¹†, Gil Weinberg¹†, Helmut Hörner²†, Kevin Pichler², Stefan Rotter²*, Ori Katz¹*


One of the key insights of non-Hermitian photonics is that well-established concepts such as the laser
can be operated in reverse to realize a coherent perfect absorber (CPA). Although conceptually
appealing, such CPAs are limited so far to a single, judiciously shaped wavefront or mode. Here,
we demonstrate how this limitation can be overcome by time-reversing a degenerate cavity laser
based on a unique cavity that self-images any incident light field onto itself. Placing a weak, critically
coupled absorber into this cavity, any incoming wavefront, even a complex and dynamically varying
speckle pattern, is absorbed with close to perfect efficiency in a massively parallel interference process.
These characteristics open up interesting new possibilities for applications in light harvesting, energy
delivery, light control, and imaging.


The absorption of light is a fundamental
process in nature, physics, and engineering
that is central to many important
tasks ranging from photosynthesis to the
operation of solar panels and detectors.
Whereas light is readily absorbed by thick 
materials that we perceive as black, thin and 
weakly absorbing media are inherently far less 
efficient in capturing incoming radiation and 
converting it into heat or other forms of en- 
ergy. A well-known strategy to make even such 
weakly dissipative substances strongly absorb- 
ing is to embed them into a resonant structure 
(1, 2). At the so-called critical coupling condition,
where the coupling strength to such a
resonator is exactly balanced with the internal 
dissipation, the incoming field gets perfectly 
absorbed with no energy being back-reflected 
from the resonator (3). However, this interfer- 
ometric enhancement of absorption places 
severe restrictions on the properties of the 
incoming field. For example, in the case of a 
single incoming channel (mode), the optical 
frequency needs to be precisely tuned to a 
resonator’s critically coupled resonance fre- 
quency (1). Generalizing the critical coupling 
condition to multichannel scattering prob- 
lems leads to the phenomenon of coherent 
perfect absorption, for which the incoming 
wavefront in all available scattering channels 
needs to be adjusted, in addition to the spec- 
tral tuning (4, 5). In other words, whether it is 
two laser beams impinging on an absorbing 
structure (6-8) or a complex microwave field 
hitting a disordered arrangement of obstacles 
(9), at the critical coupling condition, only a 
single, suitably adjusted wavefront (or spatial 
mode) gets coherently perfectly absorbed. Although
this required wavefront adjustment
opens up the possibility of controlling the ab- 
sorption process interferometrically (5), it also 
comes with the limitation that, apart from the 
correctly matched input wavefront, all of the 
possibly many other modes are only weakly 
absorbed because of the different interference 
patterns that they create. To overcome this 
restriction (10), recent works have managed to 
merge two perfectly absorbed modes at a so- 
called exceptional point, resulting in chiral 
absorption (11-13).

Here, we demonstrate how to remove the 
limitation of the number of perfectly absorbed 
modes in a coherent perfect absorber (CPA) 
entirely. Our design principle for a correspond- 
ing multimode CPA is based on the insight that 
coherent perfect absorption formally cor- 
responds to the time-reverse of laser emission 
at the first lasing threshold (7, 14). To create 
a device that can perfectly absorb arbitrary 
combinations of incoming modes interfero- 
metrically, one is thus required to time-reverse 
a laser that emits all of these modes in parallel. 
Such a laser is known as a degenerate cavity 
laser (15-17) and is based on a cavity that 
self-images the field on either one of the two 
outer cavity mirrors onto itself after each cavity 
round trip. This can be realized in a straight- 
forward fashion by placing two lenses in an 
imaging telescope configuration inside the 
cavity, ensuring coherent perfect absorption 
of any combination of modes regardless of 
their relative phases. 

We illustrate the concept of a massively 
degenerate CPA (MAD-CPA) side by side with 
a conventional single-mode CPA (Fig. 1). The 
simplest conventional CPA is engineered to 
absorb a plane-wave input at normal incidence 
by placing a critically coupled absorber be- 
tween two plane mirrors. For such an input, 
all reflections from the multiple cavity round 
trips overlap and destructively interfere when 
the CPA condition is met. The total reflection 
is reduced to zero and all of the energy is 

Fig. 1. MAD-CPA for arbitrary wavefronts: concept. (A) Conventional
(single-mode) CPA composed of a weak absorber placed between two flat 
mirrors. Although perfect absorption is achieved for a normal-incident plane wave 
through destructive interference of reflections, any other incoming mode, such 
as the simple tilted beam shown, results in multiple reflections that cannot 
destructively interfere. (B) By contrast, the proposed massively degenerate 


absorbed. However, for any other input field 
that is incident at a different angle or in another 
mode, the reflected fields from the multiple 
round trips do not have a spatially identical 
distribution anymore; their destructive inter- 
ference is out of sync and perfect absorption 
cannot be achieved (Fig. 1A). To realize a CPA 
that can universally absorb any arbitrary, com- 
plex spatial mode, one must ensure that all 
of the resonant cavity reflections coincide and 
destructively interfere with the nonresonant 
reflection at the front cavity mirror. This con- 
dition is naturally fulfilled in a degenerate 
(self-imaging) cavity design (Fig. 1B), which 
forms the basis for the degenerate cavity lasers 
that have been studied extensively for their 
unique lasing properties (18-20). Self-imaging 
is maintained for any mode supported by 
the cavity optics, be it a plane wave at any 
angle or a highly complex field with a com- 
plicated wavefront or even spatially inco- 
herent fields. 

In our experiment, we used a degenerate 
linear cavity composed of a lens-based telescope 
placed inside a cavity having a total length of 
$4f$ (Fig. 2A), with $f$ being the focal length of
each lens (21). This resonant cavity featured
a partially reflecting mirror at the front (with
a reflectivity of $R_1 = 70\%$) and a nearly perfectly
reflecting mirror at the back (with $R_2 = 99.90\%$).
A weakly absorbing 0.6-mm-thick
color-glass absorber with a single-pass transmission
of $T_{\mathrm{abs}} = 85.2\%$ was placed next to the
front cavity mirror. The CPA conditions were
met simultaneously for all input modes (even
for the relatively thick absorber; see supplementary
text S3.1) when the cavity length was resonant
with the laser wavelength. The coherent
nature of absorption in the MAD-CPA allowed 
rapid control, including strong suppression 
of the absorption to values well below the 
absorber’s bare absorption value, by simply
tuning the cavity length by a fraction of a 
wavelength. We characterized the degenerate 
CPA by injecting complex input fields using 
a spatial light modulator (SLM) illuminated 
by a wavelength-stabilized helium-neon laser. 
For each injected complex field, we measured 
the spatial distribution of the reflected light 
by imaging the front cavity mirror on an 
sCMOS camera. The very weak light intensity 
transmitted through the back cavity mirror 
was also measured for validation purposes 
(fig. S4). 

With the MAD-CPA being the equivalent 
of a time-reversed degenerate laser at thresh- 
old, the conditions necessary for reaching de- 
generate CPA are the same as the threshold 
lasing conditions. First, in lasing, the round- 
trip power gain, G, should be equal to the 
round-trip losses, i.e., $R_1 G = 1$ (assuming $R_2 = 1$).
In a CPA, this condition translates to the critical
coupling condition (3): $R_1 (T_{\mathrm{abs}}^2)^{-1} = 1$.
Second, the cavity must be aligned for perfect 
self-imaging after one round-trip propagation. 
Third, the cavity length must be resonant with 
the input wavelength. As detailed in supplementary
text S3, these conditions were met
in our experiments by selecting a front cavity- 
mirror reflectivity that matched the absorption 
of the cavity absorber, carefully aligning the 
cavity optics, and tuning the cavity length to 
resonant maximal absorption (Fig. 2, C to E). 
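
To make the critical coupling condition $R_1 = T_{\mathrm{abs}}^2$ concrete, the sketch below evaluates the textbook reflectance of a two-mirror cavity containing an absorber, as a function of the round-trip phase. It is an idealized single-mode model, not the authors' numerical simulation: it assumes a lossless front mirror, an absorber crossed twice per round trip, and no lens reflections or aberrations. The function name is hypothetical.

```python
import cmath
import math


def cavity_reflectance(R1: float, R2: float, T_abs: float,
                       round_trip_phase: float) -> float:
    """Power reflectance of a two-mirror cavity with an intracavity absorber.

    Assumptions: one mode, lossless front mirror, and an absorber crossed
    twice per round trip (round-trip amplitude factor equal to T_abs).
    """
    r1, r2 = math.sqrt(R1), math.sqrt(R2)
    # Amplitude factor picked up in one round trip inside the cavity.
    loop = r2 * T_abs * cmath.exp(1j * round_trip_phase)
    # Direct front-mirror reflection plus the sum over all round trips.
    r = (-r1 + loop) / (1 - r1 * loop)
    return abs(r) ** 2


# Values quoted in the text: R1 = 70%, R2 = 99.90%, T_abs = 85.2%.
on_res = cavity_reflectance(0.70, 0.999, 0.852, 0.0)
# A cavity length change of lambda/4 shifts the round-trip phase by pi.
off_res = cavity_reflectance(0.70, 0.999, 0.852, math.pi)
print(f"reflectance on resonance: {on_res:.4f}, detuned by lambda/4: {off_res:.4f}")
```

Within this idealized model the on-resonance reflectance is well below 1%, whereas a quarter-wavelength change in cavity length reflects about 97% of the light, so the absorption falls far below the single-pass value of roughly 15%. If $R_1$ is set exactly to $T_{\mathrm{abs}}^2$ with a perfect back mirror, the on-resonance reflectance vanishes.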

To illustrate the versatility of our MAD-CPA 
setup, we injected into it a highly complex 
input field, in which >1000 modes coherently 
formed a speckled yin-yang symbol. Figure 2, 
B to E, presents the corresponding experimen- 
tal results for the back-reflected light field 
while tuning the cavity length. As a hallmark 
of the successful operation of our device, we 
observed that the reflected power of all the 


multimode CPA can perfectly absorb any complex incident wavefront. This is 
achieved by placing the weak absorber in a degenerate (self-imaging) cavity, 
realized here by a cavity with two lenses in a telescopic arrangement. In such a 
degenerate cavity, all reflections from the multiple cavity round trips show 
perfect destructive interference with the outer reflection from the front cavity 
mirror, $R_1$ (depicted for two incoming beams at different angles).


input modes in the yin-yang input field was 
simultaneously minimized when the cavity 
length was tuned to meet the CPA condition 
(Fig. 2, C and E, and movie S1). 

The minimal experimental value reached 
for the reflectivity was ~5%, indicating that 
the weak 15% absorption of the intracavity 
absorber now featured >94% absorption for 
all incoming modes simultaneously. Moreover, 
each spatially localized mode (speckle) of the 
complex field was nearly perfectly absorbed, 
reaching a reflected power of <2% (Fig. 2, C 
and D, insets). When the cavity length was 
tuned by $\lambda/4$ away from the perfect absorption
condition, absorption was interferometrically 
suppressed to values well below the incoherent 
single-pass transmission of the absorber, provid- 
ing a modulation depth of >50 for each spatially 
localized mode. 

The small experimental deviations from per- 
fect absorption were mainly the result of weak 
spurious reflections from the cavity lenses’ 
antireflection coatings ($R = 0.13\%$ per surface;
fig. S3) and of the very small aberrations
($<\lambda/100$) of the self-imaging cavity optics and
alignment, which led to slightly different cav- 
ity round-trip lengths for the different modes 
(Fig. 2C, inset, and fig. S1). Our numerical
study showed that the small mismatch in the 
absorption value of the cavity absorber from 
the critical coupling condition was not a domi- 
nant source of deviation from perfect ab- 
sorption (fig. S2C), such that a commercially 
available absorber and a suitable front cavity 
mirror can be used. The influence of such ex- 
perimental imperfections is discussed and 
analyzed numerically in supplementary text 
S2, showing very good agreement with the
experimental measurements (Fig. 2D). 

To further demonstrate the flexibility in- 
herent in the MAD-CPA design, we illustrated 

Fig. 2. MAD-CPA setup and experimental results. (A) MAD-CPA setup in
which complex input fields are injected into a degenerate cavity containing a
critically coupled weak absorber using a computer-controlled SLM illuminated
by a wavelength-stabilized laser. A camera measures the spatial intensity
distribution of the reflected light. Coherent perfect absorption is achieved by
tuning the cavity length to be resonant with the laser wavelength. (B to E)
Results for a complex input field in the form of a speckled yin-yang symbol
composed of >1000 modes. (B) Measured reflected intensity distribution when
the cavity length is tuned for minimal absorption, coherently suppressing
absorption [see (C)] and displaying the incident input field. (C) Total
back-reflected power as a function of the cavity length (black trace). Red/green/blue
traces are the measured reflected power of the three individual localized
modes 1/2/3 (speckle grains), respectively, marked by white squares in (B)
and (E). Insets are magnified plots of the two minima in (C). (D) Same
experimental measurements as in (C) for the total reflected power (black dots)
and for the reflected power of one individual mode (red squares), together with
the numerical prediction for the total reflected power for a 100-mode input
(black lines) and for a single mode (red line) based on the experimental
parameters with no free parameters (21). Insets are magnified plots of the two
minima in (D). (E) Measured reflected intensity distribution when the cavity
length is tuned for maximum absorption, showing near-perfect absorption of all
input modes (see movie S1). Scale bars, 1 mm.


its ability to absorb dynamic, rapidly chang- 
ing complex random light fields that were 
naturally generated by transmission through 
flexible multimode fibers (MMFs) and dynamic 
atmospheric aberrations (Fig. 3). We achieved 
this by replacing the SLM with a 40-cm-long 
MMF (21). The coherent light propagation 
through the MMF generated complex speckle 
fields caused by the dispersion of the fiber 
modes (Fig. 3C). To generate not only such a 
spatial complexity but also complex temporal 
dynamics, we rapidly shook the MMF using 
an external airflow. Moreover, before injecting 
the speckle fields into the cavity, we let them 
propagate through dynamic atmospheric aber- 
rations generated by a hot air stream from a 
heat gun (Fig. 3A). The results of these ex- 
periments are shown in Fig. 3 for different 
variation dynamics. In all cases, similar near-perfect
absorption values were achieved at
CPA conditions (Fig. 3B), even for temporal 
variations that were faster than the camera 
exposure time, representing the absorption 
of spatially incoherent light fields, the time- 
reversed version of spatially incoherent lasing 
(19). Absorption will remain unchanged as 
long as the bandwidth of the input fields is 
narrower than the cavity absorption linewidth. 
This enables perfect absorption of dynamically 
varying fields as long as the correlation time, 
τ_corr, of the temporal dynamics is longer than
the photon-decay time in the cavity. Figure 
3C displays individual camera frames from 
these dynamic absorption experiments (see 
also movies S2 to S5). 
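
The two conditions above can be related by a rough worked estimate. This is a sketch under stated assumptions, not a result taken from the supplementary text: a field with correlation time τ_corr is taken to have a bandwidth of order 1/τ_corr, and a cavity with absorption linewidth Δν_cav (full width at half maximum) is taken to have photon-decay time τ_ph = 1/(2πΔν_cav). The symbols Δν_field, Δν_cav, and τ_ph are introduced here for illustration only, and the relations hold up to factors of order unity.

```latex
\Delta\nu_{\mathrm{field}} \sim \frac{1}{\tau_{\mathrm{corr}}}
\;\lesssim\;
\Delta\nu_{\mathrm{cav}} = \frac{1}{2\pi\,\tau_{\mathrm{ph}}}
\quad\Longrightarrow\quad
\tau_{\mathrm{corr}} \gtrsim 2\pi\,\tau_{\mathrm{ph}}
```

In words, a field that changes slowly compared with the time a photon spends in the cavity looks quasi-static to the absorber, which is consistent with the statement that absorption is unchanged while the input bandwidth stays inside the cavity linewidth.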

The number of modes that can be supported 
by the MAD-CPA is limited by the space- 
bandwidth product of the self-imaging optics 


(19). Perfect absorption occurs as long as the 
incoming wavefronts are within the angular 
acceptance and the imaged area of the self- 
imaging cavity optics (19) and their spectral 
bandwidth is within the spectral linewidth of 
the degenerate cavity (supplementary text S3.2). 

It may be interesting to explore whether 
perfect absorption can also be useful for en- 
hancing the sensitivity in multipass micros- 
copy (22, 23) and if other degenerate cavity 
designs, such as those making use of a smaller 
number of optical elements, may offer prac- 
tical advantages (15). The ability of a MAD-CPA 
to enhance or suppress absorption with high 
contrast for thousands of modes simultane- 
ously also offers the interesting potential for 
optical modulation and switching, particularly 
when the goal is to absorb a large fraction of 
the incident power in a weakly absorbing

Fig. 3. Coherent perfect absorption of rapidly varying complex fields.

(A) Experimental setup in which dynamic complex fields are naturally generated by 
passing the illuminating laser field through a flexible MMF shaken by strong 
airflow. The light emanating from the MMF is further subjected to atmospheric 
aberrations generated by passing it through a hot airflow from a heat gun. 

(B) Total back-reflected power as a function of cavity length in four different 
experiments. Black trace, static fiber and no atmospheric turbulence; red trace, static 


sample such as a single or a few molecular layers. 
This flexible enhancement and suppression 
of absorption is hard to achieve by simple 
optical focusing on a given absorber. 

Other promising extensions of the MAD- 
CPA concept include its use as a highly mul- 
timode reflectionless scattering system (24). 
Placing an SLM inside the laser cavity (19) may 
also allow one to digitally control or compen- 
sate for absorption and aberrations. 

Although our work has focused on the spa- 
tial degrees of freedom of the incoming waves, 
it would be fascinating to achieve broadband 
absorption also in the spectral domain (25-27), 
a direction in which advances have recently 
been made using exceptional points and nonlinear
media (11-13, 28).

fiber with additional dynamic atmospheric aberrations; green trace, dynamic shaking of 
the fiber by airflow; blue trace, rapid shaking of the fiber, where the field dynamics 

are faster than the camera exposure time, effectively demonstrating absorption of 
spatially-incoherent fields. Inset is a magnification of the left minimum in (B). 

(C) Individual camera frames from the four experiments showing the reflected intensity 
at maximum absorption (ΔL = 0) and at two different cavity lengths (ΔL = 0.2λ and
ΔL = 0.35λ), demonstrating interferometric control of absorption. Scale bars, 100 μm.

A cognitive process occurring during sleep is
revealed by rapid eye movements 


Yuta Senzai* and Massimo Scanziani*


Since the discovery of rapid eye movement (REM) sleep, the nature of the eye movements that 
characterize this sleep phase has remained elusive. Do they reveal gaze shifts in the virtual environment 
of dreams or simply reflect random brainstem activity? We harnessed the head direction (HD) system 
of the mouse thalamus, a neuronal population whose activity reports, in awake mice, their actual HD 
as they explore their environment and, in sleeping mice, their virtual HD. We discovered that the 
direction and amplitude of rapid eye movements during REM sleep reveal the direction and amplitude 
of the ongoing changes in virtual HD. Thus, rapid eye movements disclose gaze shifts in the virtual world 
of REM sleep, thereby providing a window into the cognitive processes of the sleeping brain. 


REM sleep is a phase of sleep characterized
by rapid eye movements, from which
it gets its acronym (1, 2). This phase of
sleep is present across many vertebrates
(3, 4) and has been associated with
dreaming (5, 6). Indeed, when awakened during 
REM sleep, human subjects are more likely to 
report vivid dreams as compared to when they 
are awakened during other phases of sleep 
(7-10). This observation has led to the proposal 
that the nature of rapid eye movements during 
REM sleep may relate to the content of the 
ongoing dream (7-11). If so, rapid eye move- 
ments may represent a readout of some of the 
cognitive processes that occur in the sleeping 
brain. The verification of this hypothesis, how- 
ever, has led to contradictory results. Some 
initial studies indicated a correlation between 
the content of dreams reported by the subject 
and the direction or frequency of rapid eye 
movements recorded immediately before 
awakening (7-10). However, other studies did 
not reproduce these results (12, 13). Furthermore,
lifelong blind individuals who do not
report visual experiences during dreams do 
have rapid eye movements during REM sleep 
(14). Thus, alternative hypotheses suggested 
that rapid eye movements may be unrelated 
to the mental processes that occur during REM 
sleep and simply reflect random brainstem 
activity (11, 15).

Most of these studies, however, were based 
on the potentially inaccurate reporting of dreams 
by human subjects rather than on an objective 
measure of the cognitive processes that occur in 
the brain during REM sleep. Thus, we reasoned 
that by directly monitoring some of the cognitive 
processes that occur in the brain during REM 
sleep, we could gain insight into whether rapid
eye movements actually occur in coordination
with such processes. 

We decided to use the head direction (HD) 
system of the mouse as the objective readout. 
HD cells are a population of neurons present, 
among other structures, in the anterodorsal 
nucleus of the thalamus (ADN). Their ensem- 
ble activity reports the direction of the head 
of the animal along the azimuth as it explores 
or navigates through its environment (16-18). 
During REM sleep, the population activity 
of HD cells is similar to that which occurs 
during actual navigation (19, 20), thus poten- 
tially representing an internal “virtual heading” 
of the sleeping animal. 

In awake, behaving mice exploring their 
environment, changes in HD are accompanied 
by fast, saccade-like movements of the eyes in 
the same direction (21, 22). Does a similar co- 
ordination exist in the sleeping mouse? Clearly, 
a sleeping mouse maintains a fixed heading. 
However, putative changes in the internal rep- 
resentation of heading of the sleeping mouse 
may be coordinated with the rapid eye move- 
ments that occur during REM sleep. By mon- 
itoring rapid eye movements, we may be able 
to reveal changes in internal heading that 
are occurring in the virtual world of the sleep- 
ing brain. 

We recorded from HD cells in the ADN of 
mice with extracellular linear probes while 
monitoring the movements of both eyes with 
head-mounted cameras (Fig. 1A and movies 
S1 and S2). Mice were free to explore an open-field
arena, and their heading was monitored
with a top-view camera (Fig. 1B). Mice were 
allowed to fall asleep, and their sleep phase 
was identified as REM or non-REM using 
standard electrophysiological parameters (ani- 
mals spent 40 to 52% of the session asleep, and 
10% of the sleeping period was identified as 
REM sleep; see table S1 and methods). To de- 
termine whether, during REM sleep, rapid eye 
movements are coordinated with the internal 
representation of heading, we proceeded 
through three steps: First, we established, in 


awake mice, the relationship between saccade- 
like eye movements and the internal represen- 
tation of the heading of the animal as decoded 
from the activity of HD cells (Fig. 1). Second, we 
determined the properties of rapid eye move- 
ments in mice during REM sleep (Fig. 2). 
Finally, we investigated the nature of the rela- 
tionship between rapid eye movements in 
REM sleep and the internal representation 
of heading (Figs. 3 and 4). 

We identified saccade-like eye movements 
in awake animals based on their fast dynamics 
(>400° per second). During the exploration of 
the open-field arena, saccade-like eye move- 
ments occurred mainly along the nasotemporal 
axis (fig. S1, A and B); the vast majority (94.1%)
of these eye movements were conjugated— 
that is, both eyes moved in the same direction, 
either clockwise (CW) or counterclockwise 
(CCW)—and the amplitude of these eye move- 
ments was correlated across eyes (R = 0.89, P< 
10~°; fig. S1C). Here, we focus on conjugated
saccade-like eye movements of at least 2° in 
amplitude and refer to them simply as sac- 
cades. CW and CCW saccades were coupled to 
head turns in the same direction, that is, with 
CW and CCW head turns along the azimuth, 
respectively, consistent with previous reports 
(21, 22) (fig. S1, E and F). To determine the 
relationship between the direction of sac- 
cades and the internal representation of 
heading, we analyzed the population activity 
of HD cells recorded in the ADN while the 
animal explored the open-field arena (Fig. 1C 
and fig. S2, A and B). We recorded between 30 
and 72 HD cells per animal (n = 6 mice; table 
S1), which were defined as neurons whose
activity was modulated by the heading of the 
animal along the azimuth (Fig. 1D and see 
methods). Using the heading of the animal 
and the simultaneously recorded HD cells, we 
trained an algorithm to report the heading 
solely based on the firing of HD cells (see 
methods). When tested on untrained periods 
of HD cell activity, the algorithm accurately 
decoded the heading of the mice as they ex- 
plored their environment (Fig. 1E) with an 
error of only 10.2° ± 3.5° (n = 6 mice; see fig.
S2, C and D, and methods). How well does
the direction of a saccade match the internal 
representation of a head turn? We quantified 
the internal representation of a head turn as 
the difference between the heading decoded 
200 ms before and 200 ms after a saccade (Fig. 
1F). The vast majority of saccades (95.2 ± 1.3%,
n = 6 mice) occurred in the same direction 
(CW or CCW) as the ongoing head turns 
decoded from HD cell activity (Fig. 1, G and 
H). In awake animals, saccade direction and 
the internal representation of head turns are 
thus tightly coupled. 
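
The decoding and comparison steps described above can be illustrated with a minimal sketch. It assumes a simple population-vector decoder, which is a stand-in chosen for brevity and not necessarily the algorithm described in the methods. The function names (`decode_heading`, `signed_turn`, `direction_agreement`) and the choice of which sign denotes CW are hypothetical.

```python
import numpy as np

def decode_heading(rates, preferred_deg):
    """Population-vector estimate of heading in degrees, in [0, 360).

    Assumption: each HD cell votes for its preferred direction,
    weighted by its firing rate. This is an illustrative decoder only.
    """
    theta = np.deg2rad(np.asarray(preferred_deg, dtype=float))
    rates = np.asarray(rates, dtype=float)
    x = np.sum(rates * np.cos(theta))
    y = np.sum(rates * np.sin(theta))
    return float(np.rad2deg(np.arctan2(y, x)) % 360.0)

def signed_turn(before_deg, after_deg):
    """Signed angular difference (after - before), wrapped to [-180, 180).

    Applied to headings decoded 200 ms before and 200 ms after a saccade,
    this gives the decoded head-turn amplitude; the sign convention is arbitrary.
    """
    return (after_deg - before_deg + 180.0) % 360.0 - 180.0

def direction_agreement(saccade_amp, turn_amp):
    """Fraction of events in which the saccade and the decoded turn share a sign."""
    s = np.sign(np.asarray(saccade_amp, dtype=float))
    t = np.sign(np.asarray(turn_amp, dtype=float))
    return float(np.mean(s == t))
```

Wrapping the difference matters because headings are circular: a change from 350° to 10° is a 20° turn, not a 340° turn in the opposite direction.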

Is this relationship maintained during REM 
sleep? To monitor rapid eye movements during 
REM sleep, we took advantage of the fact that

Fig. 1. Saccade direction predicts
internal representation of
head turns. (A) Schematic of the 
experimental configuration illus- 
trating a chronic electrophysio- 
logical recording from the 

ADN while eye and head move- 
ments are monitored with cam- 
eras. (B) Schematic illustration of 
the open-field arena with a mouse 
carrying head-mounted eye 
cameras and silicon probes. 

The heading of the animal is 
monitored with a top-view camera.
IR LED, infrared light-emitting
diode. (C) An example
frame from the top-view camera 
is shown at the top left, and a 
schematic illustration of the 
frame is shown at the top right. 
The top and bottom middle 
panels present example frames 
showing a CCW turn and a CW 
turn, respectively. Arrows indicate 
the heading of the animal in 
each frame. (D) The tuning 
curves of 11 example HD cells 
recorded from the ADN of an
awake mouse. The arrows on the
right indicate the preferred HD
of each HD cell. Peak firing rates
for HD cells are shown on the
right. (E) Shown at the top is a
raster plot of the firing of the
11 example HD cells shown in (D).
Middle traces illustrate the
actual heading of the animal 
(gray) and heading decoded from 
the population activity of HD cells 
(green). Triangles mark the tim- 
ing of example frames for the 
CCW turn (cyan) and CW turn 
(magenta) shown in (C). Bottom 
traces illustrate the horizontal 
position of the two eyes (red and 
blue). The vertical dashed lines 
indicate the onset of saccades. 
(F) Two snapshots of the right
eye taken before and after
a CCW saccade are shown at the top. The pupil is
delineated with a black circle. A dashed circle in the right image labels the pupil's

original position (N, nasal commissure; T, temporal commissure). The red trace 
illustrates the right eye's horizontal position in time. Saccade amplitude was 
defined as the change in the horizontal eye position upon a saccade. The green 
trace illustrates the decoded heading. The head-turn amplitude is the change 
in the decoded heading between 200 ms before and 200 ms after saccade onset. 
Note the CCW shift in decoded heading concomitant with the CCW saccade. 
(G) Summary scatter plot of the amplitude of conjugated saccades during the 


mice do not always close their eyes while asleep 
(Fig. 2, A and B, and movie S2). We quantified 
the direction and amplitude of rapid eye movements
in the sleeping animal using the head-mounted
cameras (23, 24). During REM sleep
(Fig. 2C and see methods), rapid eye movements
occurred along the nasotemporal axis (moni- 
tored over a total of 68 min of REM sleep in six 
mice; fig. S3, A and B), and the direction of these 
movements, CW or CCW, was well correlated

exploration in the open field arena (n = 13,778 from six mice). The amplitude and
sign of the decoded head turn during the saccade are color coded. Note that 
the direction of the saccades matches the direction of the decoded head turn. 
In this and the rest of the figures, the upper-right and lower-left quadrants of the 
scatter plots represent CW and CCW movements, respectively. (H) Correlation 
between the amplitude of conjugated saccades (averaged over both eyes) and 
the decoded head-turn amplitude. Saccade amplitude predicted the amplitude of 
decoded head turns for each individual mouse (gray lines) as well as for all 
mice (red line; vertical lines represent standard deviation). 


across both eyes (R = 0.58, P < 10°; Fig. 2, C to 
E), thus similar to saccades observed in awake 
animals (fig. S3C), albeit with smaller amplitudes
(average amplitude: 5.1° ± 2.5° for rapid
eye movements and 9.7° ± 3.5° for saccades) and

Fig. 2. Rapid eye movements during REM sleep. (A) Schematic of the
experimental configuration illustrating a chronic electrophysiological recording 
from the ADN while rapid eye movements are monitored with cameras in a 
sleeping mouse. (B) Snapshots of the right (top) and left (bottom) eye before 
(left) and after (right) a CCW rapid eye movement. The pupil is delineated 
with a black circle. A dashed circle in the right image labels the pupil's original 
position. Note that both eyes move CCW. (C) An example spectrogram of the 
local field potential (LFP) recorded in the ADN during non-REM sleep (NREM), 
REM sleep, and wakefulness (Wake) is shown at the top. Electromyography 
(EMG) power is shown in the middle. The horizontal position of the two eyes (red 
and blue) is shown at the bottom. Note the increase in eye movements at the 


onset of REM sleep. a.u., arbitrary units. (D) The shaded time interval in 

(C) shown on an expanded time scale. Note that for most rapid eye 
movements, both eyes moved in the same direction. Dashed black and gray
vertical lines indicate the onset of leading (not preceded by a rapid eye 
movement for at least 400 ms) and follower rapid eye movements. (E) Scatter 
plots of the amplitude of right versus left rapid eye movements during REM 
sleep for all mice (n = 6689 from six mice). Note that most data points are in 
the lower-left or upper-right quadrants, indicating CCW and CW movements 
of both eyes. (F) Distribution of intervals between rapid eye movements during 
REM sleep for all mice (n = 6689 from six mice). Leading eye movements 
are in black, and followers are in gray. 


at higher frequency (median interval: 267 ms for 
rapid eye movements and 521 ms for saccades; 
Fig. 2F and fig. S1D).

We decoded the internal representation of 
heading during REM sleep, that is, the virtual 
heading, by applying the algorithm described 
above. Virtual heading was decoded from the 
activity of the same set of HD cells that was 
used to train the algorithm, this time, how- 
ever, recorded during REM sleep. Changes in 
virtual heading, that is, virtual turns, occurred 
with an angular velocity that was similar to 
that decoded during actual turns in awake 
mice exploring the arena (fig. S4, A and B), 
consistent with previous results (19). Furthermore,
during REM sleep, HD cells maintained
a similar correlational structure as that ob- 
served during wakefulness: HD cells that either 
fired or did not fire together in wakefulness 
exhibited the same pattern during REM sleep 
(fig. S4, C and D). To test the relationship be- 
tween rapid eye movements and virtual head- 
ing, we first focused our analysis on rapid eye 
movements that were not preceded by any eye 
movement for at least 400 ms (Fig. 2F) and in 
which both eyes moved by at least 2° in the 
same direction (i.e., conjugated rapid eye move- 
ments; see methods). This allowed us to have a 
sufficiently long baseline to detect potential 
changes in virtual heading that were specif- 


ically associated with the selected conjugated 
rapid eye movements. We refer to these rapid 
eye movements as “leading” eye movements. 
The direction of leading rapid eye movements 
matched the direction of simultaneously re- 
corded virtual turns (Fig. 3, A to C). A CCW 
leading rapid eye movement, for example, 
occurred as the ensemble activity of HD cells 
shifted CCW (Fig. 3B), and vice versa (Fig. 3C). 
Overall, during REM sleep, the direction of 
leading rapid eye movements predicted the di- 
rection of the changes in virtual heading, both 
in each individual mouse (Fig. 3, D to F, and 
fig. S5) as well as in the population (Fig. 3, G 
and H). Furthermore, not only the direction

Fig. 3. Leading rapid eye
movements predict 
decoded head turns during 
REM sleep. (A) Schematic 
of the experimental 
configuration (same as
Fig. 2A). (B) Two snapshots
of the right eye during 

REM sleep taken before and 
after a CCW rapid eye 
movement are shown at the 
top. The red trace illustrates 
the right eye’s horizontal 
position in time. A raster plot 
of the firing of three example 
HD cells (out of the 11 HD 
cells illustrated in Fig. 1D, 
with the same color coding) 
is shown at the bottom. Note 
the CCW shift in heading 
representation concomitant with the CCW rapid eye
movement. (C) Four example
episodes illustrating a
concomitant shift in decoded 
heading and eye position 
during REM sleep. The top 
traces illustrate the horizon- 
tal position of the two eyes 
(red and blue) and decoded 
heading (green). The vertical 
dashed lines indicate the 
onset of leading rapid eye 
movements. Shown at the 
bottom is a raster plot of the 
firing of the 11 example HD 
cells (same as in Fig. 1D). 
(D) The average relative 
position of the right (red) 
and left (blue) eyes for CW 
(left) and CCW (right) 
leading eye movements is 
shown at the top. The aver- 
age decoded heading (green) 
is shown at the bottom. 
Shaded areas are standard 
error of the mean [average 
of 34 traces for CW and 

47 traces for CCW leading 
eye movements]. (E) Scatter 
plot of the amplitude of 
leading right versus left eye movements during REM sleep. The amplitude 
and sign of the decoded head turns during the eye movements are color 
coded. Note the good match between the direction of the rapid eye 
movements and the direction of the decoded head turn. (F) Correlation 
between the amplitude of leading eye movements during REM sleep 
(averaged over both eyes) and the decoded head-turn amplitude. For 

(B) to (F), the representative mouse is the same as that in Fig. 1, C to E. 
(G) Mean traces of horizontal eye position averaged across both eyes for 
CW (left) and CCW (right) leading eye movements (n = 6 mice) are shown at 
the top. The dataset was separated into quintiles based on the amplitude. 
Mean traces of the decoded heading in each quintile based on the
amplitude of leading rapid eye movements are shown at the bottom.

Note that leading rapid eye movements with larger amplitude coincide 

with larger decoded head turns. (H) Summary scatter plot of the amplitude 
of leading right versus left eye movements during REM sleep (n = 330 
from six mice). The amplitude and sign of the decoded head turns during 
the eye movements are color coded. (I) Correlation between the amplitude of
leading eye movements during REM sleep and the decoded head-turn 
amplitude (n = 330 from six mice). Leading rapid eye movements 
predicted the direction and amplitude of decoded head turns in each 
individual mouse (gray lines) as well as for all mice (red line; vertical 

lines indicate standard deviation).

Fig. 4. Small-amplitude follower eye movements represent recentering eye movements. (A) Example
traces of horizontal eye position (averaged across both eyes) aligned to the onset of first follower rapid 
eye movement after a CW (left) and CCW (right) leading eye movement are shown at the top. Average 
relative positions (mean of both eyes) for first follower eye movements after CW (left) and CCW (right) 


relative movement (average over 132 traces for CW and 107 traces for CCW from six mice) are 


shown in the middle. Average traces of concomitantly decoded heading are shown at the bottom. Note that, 


movement amplitude. 


of the leading eye movements but also their 
amplitude provided information about the 
ongoing internal representation of heading: 
the larger the amplitude of the leading rapid 
eye movement, the larger the angle of the 
simultaneously recorded virtual turn (Fig. 3, 
G to I).
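
The amplitude relationship described above can be summarized in two ways: a correlation coefficient across events, and mean decoded turn per quintile of eye movement amplitude (as in the quintile split mentioned for Fig. 3G). The sketch below is one possible way to compute both, not the authors' analysis code; the function name `amplitude_relationship` and its return layout are hypothetical.

```python
import numpy as np

def amplitude_relationship(eye_amp, turn_amp, n_bins=5):
    """Relate signed eye movement amplitude to signed decoded turn amplitude.

    Returns the Pearson correlation plus the mean eye amplitude and mean
    decoded turn within each of n_bins equal-count bins (quintiles by default).
    """
    eye = np.asarray(eye_amp, dtype=float)
    turn = np.asarray(turn_amp, dtype=float)
    r = float(np.corrcoef(eye, turn)[0, 1])

    # Sort events by eye amplitude and split the sorted indices into bins.
    order = np.argsort(eye)
    bins = np.array_split(order, n_bins)
    bin_eye = np.array([eye[b].mean() for b in bins])
    bin_turn = np.array([turn[b].mean() for b in bins])
    return r, bin_eye, bin_turn
```

If larger eye movements coincide with larger virtual turns, `bin_turn` increases monotonically across bins and `r` is positive.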

What is the relationship between rapid eye 
movements that follow leading eye movements 
and the virtual heading of the animal? We refer 
to these eye movements as “followers,” that is, 
those that occur less than 400 ms after another 
rapid eye movement. The above results (top 
panels of Fig. 3, D and G) show that the average 
position of the eye following a leading eye 
movement progressively returns to the original 
position occupied before the leading move- 
ment, as if recentering the eye [400 ms after a 
leading eye movement, the eye returned to 
34.1 ± 3.3% (mean ± SEM) relative to its peak
position after the leading eye movement]. By 
contrast, the simultaneously decoded virtual 
heading did not (bottom panels of Fig. 3, D 
and G). If the recentering of the eye is me- 
diated by followers, the direction of these rapid 
eye movements should be opposite relative to 
that of leading eye movements and to the 
ongoing changes in virtual heading. Indeed,

on average, first followers occur in the opposite direction as compared to the preceding leading eye
movement and to the decoded head turn. Shaded areas represent standard error of the mean. The time scale
of the top panels covers a larger interval to include the preceding leading eye movement. (B) Correlation
(heatmap) between the decoded head-turn amplitude and amplitude of rapid eye movement (in bins;
averaged for both eyes; y axis). The first column from the left shows leading eye movements (n = 330 from
six mice), the second column shows first followers (n = 239 from six mice), and the third column shows
the rest of the followers (n = 970 from six mice). Note the increased positive correlation with increasing eye



the direction of followers that occur immediately
after a leading eye movement (first followers)
was, on average, opposite to the
direction of the preceding leading eye movement
(Fig. 4A and fig. S6A) and thus opposite
to the direction of the ongoing head turns 
(Fig. 4, A and B, and fig. S6, B and C). Overall, 
the direction of followers was opposite to the 
direction of changes in virtual heading, except
for the small fraction of followers (4.3%)
with the largest amplitudes (>10°; Fig. 4B
and fig. S6, D and E), which, like leading eye
movements, matched the direction of virtual
head turns.
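
The leading and follower categories used above follow from two criteria stated in the text: a 400-ms gap since the previous eye movement, and a conjugated movement of at least 2° in both eyes. A minimal sketch of that classification follows; the function name `classify_movements`, the label strings, and the "other" category for movements meeting neither definition are hypothetical choices made for illustration.

```python
import numpy as np

def classify_movements(onsets_ms, left_amp, right_amp, gap_ms=400.0, min_amp=2.0):
    """Label each rapid eye movement as leading, first_follower, follower, or other.

    onsets_ms must be sorted in time. Amplitudes are signed (one sign per direction).
    """
    onsets = np.asarray(onsets_ms, dtype=float)
    left = np.asarray(left_amp, dtype=float)
    right = np.asarray(right_amp, dtype=float)

    # Time since the previous movement; the first movement has an infinite gap.
    prev_gap = np.diff(onsets, prepend=-np.inf)

    # Conjugated: both eyes move in the same direction by at least min_amp degrees.
    conjugated = (np.sign(left) == np.sign(right)) & (
        np.minimum(np.abs(left), np.abs(right)) >= min_amp
    )

    leading = (prev_gap >= gap_ms) & conjugated
    follower = prev_gap < gap_ms

    labels = np.full(onsets.shape, "other", dtype=object)
    labels[follower] = "follower"
    labels[leading] = "leading"

    # A first follower is a follower whose immediately preceding movement was leading.
    prev_leading = np.concatenate(([False], leading[:-1]))
    labels[follower & prev_leading] = "first_follower"
    return labels
```

For example, movements at 0, 150, 300, and 1000 ms, where the last one is not conjugated, would be labeled leading, first_follower, follower, and other, respectively.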

Taken together, these data demonstrate a 
tight relationship between rapid eye movements 
and the internal representation of the heading 
of the animal during REM sleep (Fig. 4B). Not 
only does the direction of leading rapid eye 
movements predict the direction of change in 
virtual heading, but their amplitude predicts 
the magnitude of the change. Conversely, fol- 
lower eye movements recenter the eye. Thus, 
our results demonstrate that rapid eye move- 
ments provide a readout of the internal rep- 
resentation of heading in the sleeping brain. 
The coordination between rapid eye move- 
ments during REM sleep and the HD system 


suggests that shifts in virtual heading are part 
of a globally orchestrated representation of 
“virtual navigation” by the sleeping brain
rather than the result of some uncorrelated 
random walk of the HD system (20). 

How do eye movements that occur during 
REM sleep map onto eye movements observed 
during wakefulness? Leading rapid eye move- 
ments may correspond to saccades because 
both match the direction of the ongoing head 
turn, virtual or real (21, 22, 25). Follower eye 
movements, however, appear unlike any eye 
movement in the awake animal. In the awake 
mouse, the recentering of the eye after a sac- 
cade is mediated by image stabilizing reflexes— 
the vestibular and optokinetic reflexes—that are 
engaged by the ongoing head turn (21, 22, 25).
In the sleeping, motionless animal, however, 
the sensory periphery that triggers these re- 
flexes is not engaged. It is conceivable that 
recentering eye movements during REM sleep 
may still be triggered by activity in the ves- 
tibular nuclei. Such activity, although indepen- 
dent of the sensory periphery, could be part 
of the globally orchestrated virtual navigation 
mentioned above. 

Whether rapid eye movements in REM 
sleep reveal the subjects’ direction of gaze in 
the imagery of dreams has been debated since 
the discovery of this sleep phase and its asso- 
ciation with vivid dreams (7-13). Because of 
the lack of objective measures that assess the 
content of dreams, initial studies have led to 
conflicting results, leading to the conclusion 
that rapid eye movements are likely uncorre- 
lated with the cognitive activity of the sleeping 
brain (12, 13). More recently, studies performed
on human patients or animal models that par- 
tially enact their dreams because of reduced 
muscle atonia during REM sleep have led to a
reevaluation of the original hypothesis (26, 27). 
In these studies, at least some rapid eye move- 
ments appeared to be coordinated with the 
direction of the behavior enacted during REM 
sleep. However, the extent to which the co- 
ordination between eye and body movement is 
a result of the pathological or experimentally 
reduced atonia still remains unclear. 

By harnessing the HD system of the rodent 
and the correlation between orienting head 
and eye movements in awake animals, we have 
established a clear relationship between changes 
in virtual heading and rapid eye movements 
during REM sleep. Thus, our results indicate 
that rapid eye movements provide an external 
readout of an internal cognitive process that is 
occurring during REM sleep, namely the change 
in virtual heading. Our results further suggest 
the existence of a globally coordinated activity 
among distinct systems in the sleeping brain 
during REM sleep, a coordination that may 
underlie the realistic and vivid experience of 
dreams. Understanding the neurophysiological 
mechanisms of this coordination will give us
insight into the organization of the brain’s
generative model of the world (28, 29).

Structurally integrated 3D carbon tube grid-based
high-performance filter capacitor 


Fangming Han, Ou Qian, Guowen Meng, Dou Lin, Gan Chen, Shiping Zhang,
Qijun Pan, Xiang Zhang, Xiaoguang Zhu, Bingqing Wei


Filter capacitors play a critical role in ensuring the quality and reliability of electrical and electronic 
equipment. Aluminum electrolytic capacitors are the most commonly used but are the largest filtering 
components, limiting device miniaturization. The high areal and volumetric capacitance of electric 
double-layer capacitors should make them ideal miniaturized filter capacitors, but they are hindered by 
their slow frequency responses. We report the development of interconnected and structurally 
integrated carbon tube grid-based electric double-layer capacitors with high areal capacitance and rapid 
frequency response. These capacitors exhibit excellent line filtering of 120-hertz voltage signal and 
volumetric advantages under low-voltage operations for digital circuits, portable electronics, and 
electrical appliances. These findings provide a sound technological basis for developing electric double- 
layer capacitors for miniaturizing filter and power devices. 


Filter capacitors play a critical role in ensuring
the quality and reliability of electrical
and electronic equipment, especially memory
devices and computers (1, 2). Circuit
filtering has been dominated by aluminum
electrolytic capacitors (AECs), which, unfortunately,
are always the largest electronic
component owing to their low volumetric 
capacitances (1, 3, 4). Therefore, developing 
new types of small-sized filter capacitors is 
vital to meet the current and emerging de- 
mands of digital circuits and portable electronics. 
The high areal and volumetric capacitance of 
electric double-layer capacitors (EDLCs) should 
make them an ideal candidate, but this is 
hindered by their slow frequency response 
(<1 Hz) (2, 5, 6). Miller et al. demonstrated the 
feasibility of using graphene-based EDLCs for 
circuit filtering (7). They revealed that the 
slow response of EDLCs could be modulated 
to meet the needs of circuit filtering applica- 
tions by manipulating electrode materials 
and structures to enhance electrical and ionic
conductivities. 

EDLCs can be used in filter circuits to convert
alternating current (ac) into direct current;
however, they must have a high-frequency
response to smooth the leftover ac
ripples (7-12). The electrode materials must
have superior electrical conductivity and fast 
ionic response to achieve rapid frequency 
performance (13). Additionally, the EDLCs 
are expected to have a high volumetric (C_V)
and areal (C_A) specific capacitance. C_A is a
more accurate evaluation index because the
electrode thickness would be limited to ensure
the rapid distribution of ions onto the inner
surfaces to secure the high-frequency response.
For a given capacitance, a low C_A will require
increasing the active and inactive materials
in the EDLC (2, 14), resulting in a low C_V.
Currently, EDLCs mainly use nanostructured 
carbon-based electrodes (15, 16). To achieve a 
high-frequency response, such EDLCs can only
use a low loading of active materials, resulting
in an inferior C_A (9, 16, 17). This is because
a high loading of active materials, such as 
graphene or carbon nanotube (CNT) arrays, 
tends to agglomerate into multilayer forms 
or bundles, leading to increased resistance 
to ion distribution and, hence, slow response 
(18, 19). Although various approaches, such 
as using vertically structured and macropo- 
rous graphene, have been reported, these 
issues remain unresolved (1, 3, 17). Therefore,

Fig. 1. Synthesis and characteristics of the 3D-CNT@CT grid. (A) Schematic illustration of synthetic procedures
of the 3D-CNT@CT grid. (B) Typical cross-sectional view, (C) top view, and (D) enlarged cross-sectional view
SEM images of 3D-CNT@CT. (Insets) High-magnification SEM images. (E) High-resolution TEM image of two
adjacent vertical CTs connected by lateral CTs with filling CNTs. (Inset) Magnified TEM image of a CNT.

Fig. 2. Assembly structure and electrochemical impedance spectra of the 3D-CTG capacitor.
(A) Schematic of EDLC assembly structure. (B) Complex plane plot of the 3D-CTG-based EDLCs. (C) Phase
angle versus frequency of 3D-CT-10-, 3D-CNT@CT-10-, 3D-RCT-10-, and 3D-RCT-12-based EDLCs and
commercial AEC (Panasonic, Japan, 6.3 V/330 μF).


it is envisaged that a high-performance 
carbon-based filter EDLC electrode must 
have superior structural stability to maintain 
its high volumetric and areal capacitances 
and fast ion migration under operando
conditions. 

Here, we demonstrate the fabrication of
high-performance line-filtering EDLCs using
a three-dimensional (3D) carbon tube (CT) grid
(3D-CTG) as the electrode. This grid, with truly 
interconnected and structurally integrated ver- 
tical and lateral CTs (denoted as 3D-CTs), can 
provide high structural stability, superior 
electrical conductivity, and an effective open 
porous structure and was synthesized by a 
chemical vapor deposition (CVD) method with 
the aid of a 3D interconnected nanoporous 
anodic aluminum oxide (3D-AAO) template 
(see materials and methods and figs. S1 and 
S2 in the supplementary materials) (19-21).
The 3D-AAO template with lateral pores con- 
necting the adjacent vertical channels was 
obtained by the anodization of Al foils with 
Cu impurity to form highly ordered vertically 
aligned nanochannels with Cu-containing
nanoparticles embedded in the channel walls and
subsequent selective wet-chemical etching of 
the nanoparticles (20, 21). The 3D-CTs were 
constructed after growing CTs inside the 
3D-AAO nanoporous template by pyrolyz- 
ing acetylene and removing the template. To 
increase the specific surface area and further
enhance C_A, the 3D-CTs can be modified, for
example by filling the vertical and lateral
CTs with much-smaller-diameter CNTs
(3D-CNT@CT) by means of the Ni catalyst-assisted
CVD method (Fig. 1A) (22), or by surface
treatment with KMnO₄ (3D-RCT, i.e., 3D-CT with a
rough surface) (fig. S3) (23).

The synthesized 3D-CTG film is flexible, with
a diameter of 54 mm and a uniform thickness
of 10 μm; both dimensions can be controlled by
using 3D-AAO templates of different sizes and
thicknesses (Fig. 1B and fig. S4). Scanning electron
microscopy (SEM) images reveal that the 3D- 
CTG films consist of uniformly distributed 
vertically aligned CTs, which are intercon- 
nected by smaller lateral CTs to form a 3D 
grid, and for 3D-CNT@CT, much-smaller- 
diameter CNTs were grown inside the vertical 
CTs (Fig. 1, C and D, and fig. S5). Transmission 
electron microscopy (TEM) images show that 
the vertical and lateral CTs are structurally 
integrated through chemical means rather 
than physical attachments (Fig. 1E and fig. S6). 
Moreover, the rough outer and inner surfaces 
of the 3D-RCT are also demonstrated (fig. S7). 

EDLCs were assembled with two 3D-CTG
electrodes of identical thickness [denoted as
3D-CT-10, 3D-CNT@CT-10, 3D-RCT-10, or
3D-RCT-12, where the numbers represent thickness
(in micrometers)] separated by a nonwoven
membrane with 1 M H₂SO₄ electrolyte (Fig. 2A).
The Nyquist plot of the impedance obtained
from the EDLCs (Fig. 2B) displays an imaginary
response (Z″) almost perpendicular to the real axis,
indicating a near-perfect capacitive characteristic
and no porous electrode behavior (1, 24, 25).
Also, there is no high-frequency semicircle, the
feature associated with a series
of passive layers (1, 25). The Bode plots
(frequency dependence of the phase angle) of 
Fig. 3. Frequency-dependent capacitances of 3D-CTG-based EDLCs. (A and B) C_A and C_V
versus frequency of 3D-CT-10-, 3D-CNT@CT-10-, 3D-RCT-10-, and 3D-RCT-12-based EDLCs and
commercial AEC. (C) Comparison of the C_A at 120 Hz of 3D-CT-10-, 3D-CNT@CT-10-, 3D-RCT-10-,
and 3D-RCT-12-based EDLCs and other reported electrochemical capacitors used in
ac filter circuits with a phase angle near or less than -80°. The abbreviations and corresponding
phase angles in (C) are as follows: AEC (Panasonic, -83°); VG (vertical graphene, -82°) (1);
UGN (unzipped graphene nanoribbons, -85°) (9); G/CNT (graphene/CNT, -81.5°) (15);
ErGO (electrochemical reduction of graphene oxide, -84°) (7); POG (perpendicularly oriented
graphene, -82°) (5); Ti3C2/PEDOT:PSS [Ti3C2/poly(3,4-ethylenedioxythiophene):
poly(styrenesulfonate), -79°] (13); CNT (-81°) (16); PEDOT:PSS
[poly(3,4-ethylenedioxythiophene):poly(styrenesulfonate), -83.6°] (8); VOGN
(vertically oriented graphene nanosheets, -85°) (11); and EOG (edge-oriented graphene, -80.6°) (27).
Fig. 4. Performance characteristics of single EDLC and EDLCs in series. (A) Nyquist plots. (B) Phase angle
versus frequency. (C) Real and imaginary parts of capacitance versus frequency. (D) DF versus frequency.
(E) Filtering results of the six EDLCs in series in comparison with AECs. (F) A volumetric comparison of
3D-CTG-electrode EDLCs with commercial AECs (red triangles; Panasonic, Nichicon, and Nippon, Japan).

the EDLCs and commercial AECs (Fig. 2C and
fig. S8) were used to evaluate their frequency 
responses. An ideal EDLC should have a
minimum phase angle of -90° (1). The 3D-CTG-based
EDLCs and the commercial AEC reach phase angles
below -80° at frequencies below 200 Hz. The cutoff
frequency f₋₄₅ is measured at a phase angle of -45°,
where resistance and capacitive reactance are equal;
it marks the boundary between resistive and
capacitive behavior. The measured f₋₄₅ values of the
3D-CT-10-, 3D-CNT@CT-10-, and 3D-RCT-10-based EDLCs and the
commercial AEC are 2634, 2360, 1332, and
1502 Hz, respectively. These results demonstrate
that the EDLCs built with 3D-CTGs
have frequency responses similar to that of
the commercial AEC. A low phase angle at 120 Hz
is a vital indicator for practical applications
of EDLCs as ac line-filtering capacitors. This
angle is below -81° for the 3D-CTG-based
EDLCs, similar to the phase angle of the commercial
AEC (-83°; Panasonic, Japan, 330 μF,
6.3 V). The excellent and stable ac line-filtering
performance is also demonstrated (fig. S9).
The frequency dependences of the areal 
and volumetric capacitances (Fig. 3, A and
B) show that the 3D-CTG-based EDLCs can
deliver much higher capacitances at all fre- 
quencies than those of AECs, and only slightly 
decreased capacitances were observed when 
the frequency increased from 107’ to 10° Hz. 
The C_A of the 3D-RCT-12-based EDLCs can
reach 2.81 mF cm⁻² at 120 Hz, a higher areal
specific capacitance than that of other filtering
EDLCs reported to date with a phase
angle less than -80° (Fig. 3C and table S1)
(1, 5, 7-9, 11, 13, 16, 26, 27). The C_V at 120 Hz can
achieve 1.36 F cm⁻³ for 3D-RCT-10 electrodes.
This excellent performance, which reflects
enhanced structural stability, electrical
conductivity, and ion response, illustrates
the superiority of the truly interconnected and
structurally integrated 3D-CTGs.
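
The cutoff frequency and the 120-Hz phase angle discussed above can be illustrated with a minimal sketch that treats the device as an ideal resistor in series with an ideal capacitor. That equivalent circuit is a simplifying assumption, and the ESR and capacitance values below are hypothetical, chosen only for illustration rather than taken from the measurements.

```python
import math


def phase_angle_deg(freq_hz, esr_ohm, cap_farad):
    """Phase angle of an ideal series R-C branch in degrees.

    0 degrees is purely resistive; -90 degrees is purely capacitive.
    """
    reactance = 1.0 / (2.0 * math.pi * freq_hz * cap_farad)  # capacitive reactance
    return -math.degrees(math.atan2(reactance, esr_ohm))


def cutoff_frequency_hz(esr_ohm, cap_farad):
    """Frequency at which the reactance equals the ESR (phase angle of -45 degrees)."""
    return 1.0 / (2.0 * math.pi * esr_ohm * cap_farad)


# Hypothetical device values (assumptions for illustration only).
ESR_OHM = 0.2
CAP_FARAD = 300e-6

f_cutoff = cutoff_frequency_hz(ESR_OHM, CAP_FARAD)
print(f"cutoff frequency: {f_cutoff:.0f} Hz")
print(f"phase at cutoff:  {phase_angle_deg(f_cutoff, ESR_OHM, CAP_FARAD):.1f} deg")
print(f"phase at 120 Hz:  {phase_angle_deg(120.0, ESR_OHM, CAP_FARAD):.1f} deg")
```

In this idealized picture, a higher cutoff frequency leaves more margin between 120 Hz and the resistive regime, so the phase angle at 120 Hz sits closer to -90°.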

To meet the voltage requirements of the ac 
line filtering, taking the 6.3-V AEC as a bench- 
mark, six EDLCs based on the 1.4-cm-diameter 
3D-CT electrodes were assembled in series 
(fig. S10). The EDLCs in series also show ideal 
capacitive behavior (fig. S11). To eliminate 
the influence of the contact resistance, a four- 
electrode test method was used to evaluate 
the electrochemical impedance spectroscopy 
of the EDLCs. Complex plane plots of a single 
EDLC and the six-EDLC series (Fig. 4A) ex- 
hibit close to 90° slopes at a high frequency, 
which is also characteristic of a pronounced 
capacitive behavior. In principle, the increase in
equivalent series resistance (ESR) caused by
connecting capacitors in series would
degrade the frequency response.
However, the curves of a single device and
the six EDLCs in series showing the frequency
dependence of the phase angle in the Bode plots
almost exactly overlap below 10⁴ Hz (Fig. 4B),
and the phase angle at 120 Hz reaches -82°,
suggesting that the sixfold increase of ESR
does not slow the frequency response. This is
mainly because the rise of ESR is accompanied
by a corresponding increase in capacitive
reactance (X_C) (table S2), which makes
high-voltage line-filtering EDLCs feasible.
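
The argument that ESR and capacitive reactance rise together can be checked with a short sketch. It assumes identical cells, each modeled as an ideal series R-C branch, and the single-cell values are hypothetical. Under those assumptions, stacking n cells multiplies both the ESR and the reactance by n, so their ratio, and therefore the phase angle, is unchanged.

```python
import math


def series_stack(esr_ohm, cap_farad, n_cells):
    """Equivalent ESR and capacitance of n identical capacitors wired in series."""
    return esr_ohm * n_cells, cap_farad / n_cells


def capacitive_reactance(freq_hz, cap_farad):
    """Magnitude of the capacitive reactance in ohms."""
    return 1.0 / (2.0 * math.pi * freq_hz * cap_farad)


def phase_angle_deg(freq_hz, esr_ohm, cap_farad):
    """Phase angle of an ideal series R-C branch in degrees."""
    return -math.degrees(math.atan2(capacitive_reactance(freq_hz, cap_farad), esr_ohm))


# Hypothetical single-cell values (assumptions for illustration only).
SINGLE_ESR_OHM = 0.2
SINGLE_CAP_FARAD = 300e-6
FREQ_HZ = 120.0

stack_esr, stack_cap = series_stack(SINGLE_ESR_OHM, SINGLE_CAP_FARAD, 6)

for label, esr, cap in (("single cell", SINGLE_ESR_OHM, SINGLE_CAP_FARAD),
                        ("six in series", stack_esr, stack_cap)):
    x_c = capacitive_reactance(FREQ_HZ, cap)
    angle = phase_angle_deg(FREQ_HZ, esr, cap)
    print(f"{label}: ESR = {esr:.2f} ohm, X_C = {x_c:.2f} ohm, phase = {angle:.2f} deg")
```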

In the complex model of the capacitance
(Fig. 4C), C′(f) represents the accessible capacitance
at the corresponding frequency, while
C″(f) accounts for energy dissipation due to
irreversible processes caused by the diffusion
and polarization (28). The 3D CT array structure
allows high capacitance to be maintained to
frequencies exceeding 120 Hz, and the capacitance
of a single EDLC is almost six times that
of the six EDLCs in series, which is consistent
with the law of capacitors in series. Notably,
for both the single EDLC and the six EDLCs in series,
C″(f) goes through its maximum at the same
frequency, f₀, defining a time constant τ₀ =
1/f₀ = 0.75 ms, which further confirms that
high-performance and high-voltage filtering capacitors
can be achieved by connecting EDLCs in series.
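
As a rough illustration of the complex-capacitance analysis, the sketch below computes C′(f) and C″(f) from an impedance using the common convention C(ω) = 1/(jωZ) = C′ - jC″. The series R-C impedance and the component values are assumptions made for illustration; the measured spectra in Fig. 4C are not reproduced here. For such a model, the C″ peak falls where ωRC = 1, and because stacking identical cells leaves the product RC unchanged, the single cell and the six-cell stack peak at the same frequency.

```python
import math


def series_rc_impedance(freq_hz, esr_ohm, cap_farad):
    """Impedance of an ideal series R-C branch as a complex number."""
    return complex(esr_ohm, -1.0 / (2.0 * math.pi * freq_hz * cap_farad))


def complex_capacitance(freq_hz, impedance):
    """Return (C', C'') from C = 1 / (j * omega * Z) = C' - j * C''."""
    omega = 2.0 * math.pi * freq_hz
    mag_sq = abs(impedance) ** 2
    c_real = -impedance.imag / (omega * mag_sq)  # accessible capacitance
    c_imag = impedance.real / (omega * mag_sq)   # dissipative part
    return c_real, c_imag


def peak_frequency(freqs_hz, esr_ohm, cap_farad):
    """Grid frequency at which C'' is largest for the series R-C model."""
    def loss_term(freq_hz):
        return complex_capacitance(freq_hz, series_rc_impedance(freq_hz, esr_ohm, cap_farad))[1]
    return max(freqs_hz, key=loss_term)


# Hypothetical values (assumptions): one cell, and six identical cells in series.
ESR_OHM, CAP_FARAD, N_CELLS = 0.2, 300e-6, 6
FREQS_HZ = [10 ** (k / 100) for k in range(0, 601)]  # 1 Hz to 1 MHz, log-spaced

peak_single = peak_frequency(FREQS_HZ, ESR_OHM, CAP_FARAD)
peak_stack = peak_frequency(FREQS_HZ, ESR_OHM * N_CELLS, CAP_FARAD / N_CELLS)
print(f"C'' peak, single cell:   {peak_single:.0f} Hz")
print(f"C'' peak, six in series: {peak_stack:.0f} Hz")
```

With the definition used in the text, τ₀ = 1/f₀, the reported 0.75 ms corresponds to a peak frequency of roughly 1.3 kHz.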

In addition, the high-frequency response
can also be demonstrated by the variation of
C′/C and dissipation factors (DFs) with frequency.
The C′/C value approaches 1 below
a kilohertz (fig. S11C), indicating ideal
capacitive behavior in this frequency range.
The DFs have a very small value at 120 Hz
(Fig. 4D), indicating that the devices have low
loss. Although there is still a gap
with an AEC, a smaller DF can be achieved
by adjusting the thickness of the 3D-CTG.
More critically, the EDLCs in series also exhibit
excellent ac line-filtering performance
(Fig. 4E), and the ripple current is ~0.014 A at
120 Hz (calculations given in the supplementary
materials).

It is critical to compare the volume of the
EDLCs to the comparably rated AECs (Fig. 4F
and fig. S12). The calculated capacitance per
volume (C_V, in farads per cubic centimeter) for
the exemplified 3D-RCT-based EDLC in the
aqueous electrolyte is 0.21/V², where V is
the voltage rating (all components included).
The volumetric capacitances are much higher
than that of a commercial AEC when the operation
voltage is below ~10 V. To increase operating
voltages, an organic electrolyte
(tetraethylammonium tetrafluoroborate in acetonitrile)
was used (figs. S13 and S14; calculations given
in the supplementary materials). The volumetric
capacitance then increases
to 0.66/V², and the volumetric advantages
over an AEC extend to voltages below
~25 V, offering considerable advantages in
low-voltage operations for digital circuits, portable
electronics, and small appliances and
meeting the roadmap of low operating voltage
in high-performance devices and systems (29).
The comparison of CV/v values (the amount
of electric charge stored per volume, where V is
voltage and v is volume) shows a similar
result (fig. S15). It should be noted that given
the low operation voltage of an individual
EDLC, more EDLCs must be connected in
series to reach a higher rated voltage. Further
optimization to achieve volumetric advantages
at higher voltages is also possible through
applying an in-plane interdigitated electrode
configuration and using solid or ionic electrolytes
(25, 30). Replacing bulky AECs with the developed
3D-CTG-based EDLCs would benefit the
miniaturization of portable electronics, mobile
power supplies, electrical appliances, and
distributed energy harvesting and power supply
on the Internet of Things, greatly promoting
the development of high-performance digital
circuits and emerging electronic technologies.
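
The 1/V² dependence quoted above follows from series stacking if the cells are identical: n cells in series give n times the rated voltage and n times the volume but only 1/n of the capacitance, so capacitance per volume falls as 1/n², that is, as 1/V². The sketch below simply evaluates the two quoted expressions, 0.21/V² (aqueous) and 0.66/V² (organic); the list of rated voltages is an arbitrary choice for illustration.

```python
AQUEOUS_COEFF = 0.21  # coefficient quoted for the aqueous electrolyte
ORGANIC_COEFF = 0.66  # coefficient quoted for the organic electrolyte


def volumetric_capacitance(rated_voltage, coefficient):
    """Capacitance per volume (F cm^-3) from the form coefficient / V**2."""
    return coefficient / rated_voltage ** 2


# Arbitrary rated voltages chosen only to show the trend.
for voltage in (3.3, 6.3, 10.0, 25.0):
    aqueous = volumetric_capacitance(voltage, AQUEOUS_COEFF)
    organic = volumetric_capacitance(voltage, ORGANIC_COEFF)
    print(f"{voltage:5.1f} V: aqueous {aqueous * 1e3:7.3f} mF cm^-3, "
          f"organic {organic * 1e3:7.3f} mF cm^-3")
```

Doubling the rated voltage cuts the volumetric capacitance by a factor of four, which is why the advantage over AECs is confined to low operating voltages.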

In this study, we have synthesized
freestanding thin films with truly
interconnected, structurally integrated 3D CT grids.
The freestanding thin films of 3D-CT,
3D-CNT@CT, and 3D-RCT have been
used to fabricate EDLCs that were shown to
resolve two critical bottlenecks:
the slow frequency response of existing
carbon-based EDLCs as ac line-filtering
capacitors and the low C_A and C_V of
commercial AECs. These encouraging results
pave the way for the miniaturization of filtering
capacitors with high capacitance using
carbon-based electrodes, which is essential for
current and emerging portable electronics.

FAUNAL DECLINE

Collapse of terrestrial mammal food webs since the Late Pleistocene


Evan C. Fricke, Chia Hsieh, Owen Middleton, Daniel Gorczynski, Caroline D. Cappello,
Oscar Sanisidro, John Rowan, Jens-Christian Svenning, Lydia Beaudrot


Food webs influence ecosystem diversity and functioning. Contemporary defaunation has reduced food 
web complexity, but simplification caused by past defaunation is difficult to reconstruct given the 
sparse paleorecord of predator-prey interactions. We identified changes to terrestrial mammal food 
webs globally over the past ~130,000 years using extinct and extant mammal traits, geographic ranges, 
observed predator-prey interactions, and deep learning models. Food webs underwent steep regional 
declines in complexity through loss of food web links after the arrival and expansion of human 
populations. We estimate that defaunation has caused a 53% decline in food web links globally. Although 
extinctions explain much of this effect, range losses for extant species degraded food webs to a 
similar extent, highlighting the potential for food web restoration via extant species recovery. 


Human activities have caused global extinction
or local extirpation of many
animal species (1). Habitat loss, direct
exploitation, invasive species, and other
global change drivers have contributed
to recent defaunation (2), which in turn has
caused cascading impacts on biodiversity and
ecosystem functioning through disruption of
food webs (1, 3, 4). Yet defaunation, and the
potential for food web disruption, is not only a 
contemporary phenomenon. Declines in spe- 
cies diversity since the last interglacial period 
(~130,000 years ago) are well known for groups 
such as terrestrial mammals (5, 6). Although 
there are persistent discussions on the relative 
roles of humans, climate, and their interactions 
as drivers of these extinctions, the spatiotemporal 
pattern of declines strongly suggests a major 
human role (5, 7-9). The past, and ongoing, 
selective loss of large-bodied mammals has 
caused a marked downsizing of mammal as- 
semblages relative to the preceding 30 million 
years (7). The strong ecological effects of 
human-induced food web disruption observed 
in recent decades (10) raise questions regard- 
ing the global magnitude and timing of food 
web changes that have resulted from extinc- 
tions, local extirpation, and species introduc- 
tions throughout human history and prehistory. 
However, evidence of species-specific predator- 
prey interactions has seldom been preserved in 
the fossil record (11), and the scarcity of fossil
evidence has prevented direct quantification of 
past defaunation’s effects on food webs. 

When direct observations are unavailable,
researchers can construct food webs by model- 
ing predator-prey interactions; ecologists com- 
monly use two approaches. First, trait-matching 
models examine how an ecologically relevant 
trait of predator or prey species relates to 
observed interactions. Body mass is recog- 
nized as a key trait, and the ratio of predator 
to prey body mass is a central determinant of 
food web interactions (12-14). By fitting a
model using interaction observations and body 
mass ratios, researchers can estimate interac- 
tions given masses of candidate predator and 
prey species (15-18). Second, phylogenetic mod- 
els rely on closely related species preying on, or 
being preyed on by, other sets of closely related 
species. Using observed interactions from the 
field or literature, researchers may estimate that 
a predator that is known to prey on one mem- 
ber of a genus would also prey on another 
member of the genus (19, 20). A third approach,
which extends the trait-matching approach 
to multiple traits and complex relationships 
among them, uses machine learning algorithms 
trained on observed interactions to predict 
interactions among candidate species on the 
basis of their traits (21). For each of the three
approaches, researchers use a simplifying as- 
sumption that species have similar interaction 
determinants over space and time. This allows 
researchers to predict food webs given any 
scenario of species composition, which can 
include reconstructing food webs that likely 
occurred under past species composition, gen- 
erating current food webs in regions where 
food webs have not been recorded directly, and 
forecasting future food webs under altered 
species composition. Most existing applications 
have reconstructed food webs over time at 
regional scales using body mass data (15, 17, 22). 

Here, we provide a global reconstruction of 
terrestrial mammal food webs by modeling 
predator-prey interactions through a synthesis 
of data on observed interactions and species
traits. Although food webs involving only mammals
represent just a portion of the food web
encompassing all species, a spatiotemporal re- 
construction of mammal food webs is possible 
because of the strong fossil record and data on 
modern predator-prey interactions available for 
this group (23). We assembled a global database 
encompassing >17,000 unique predator-prey 
records for co-occurring pairs of extant mam- 
mal species from the scientific literature and 
existing databases of predator-prey interac- 
tions (24, 25). We used a synthesis of trait 
databases (26) covering extant and extinct 
mammals to characterize each species based 
on variables related to morphology, life his- 
tory, and ecology (27). To assess the ability 
to predict predator-prey interactions globally 
using each of the approaches described above, 
we built a model using 75% of the records and 
tested its predictive performance on the 25% 
of records withheld (table S1). The deep learn- 
ing model strongly outperformed the commonly 
applied approaches that are based on body mass 
ratio or genus-level information; when fitted 
using the full dataset, it achieved 90% accu- 
racy [area under curve (AUC) = 0.93; kappa = 
0.52; true skill statistic (TSS) = 0.69]. 
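
For readers unfamiliar with the evaluation statistics quoted above, the following sketch shows how accuracy, Cohen's kappa, and the true skill statistic (TSS) are conventionally computed from a binary confusion matrix. The counts are hypothetical and unrelated to the study's data; AUC is omitted because it requires the model's continuous scores rather than a single confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, Cohen's kappa, and TSS from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    tss = sensitivity + specificity - 1.0
    # Agreement expected by chance, given the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - expected) / (1.0 - expected)
    return {"accuracy": accuracy, "kappa": kappa, "tss": tss}


# Hypothetical counts (assumption): 100 candidate predator-prey pairs.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Kappa and TSS both correct for agreement that would arise by chance, so with imbalanced classes they can be noticeably lower than raw accuracy.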

To illustrate how the deep learning model 
can be used to reconstruct food webs, Fig. 1 
demonstrates, for several locations, food webs 
generated under alternate scenarios of mam- 
mal species composition. The two scenarios 
shown here represent extant species’ current 
ranges or ranges of both extant and extinct 
mammals as they would occur today in the 
absence of human-linked extinction, local 
extirpation, or introduction from the Late 
Pleistocene to the present day (6). The range 
reconstructions leverage historical records and 
fossil evidence to model ranges while account- 
ing for range shifts due to climatic changes 
since the Late Pleistocene (27). Comparisons 
among species composition scenarios allow us 
to assess how defaunation at sites worldwide 
has led to the simplification of food webs. 

Having developed the capacity to generate 
food webs given alternate scenarios of species 
composition, we sought to quantify how ex- 
tinctions have affected food webs over time 
and to assess food web resilience to extinction. 
We used estimates of extinction dates from the 
fossil record (5) and focused on species extinc- 
tion at regional scales because detailed tempo- 
ral estimates of range changes are unavailable 
for most species. Figure 2 shows change over 
time, averaged within each region, in two basic 
measures of food web complexity: the number 
of species participating in a food web, and the 
number of food web links (28). To examine 
whether a reduction in the number of species 
present led to a decline in food web proper- 
ties different from that expected by chance, 
we compared observed changes in food web 
complexity to a null model in which the same 
Fig. 1. Comparing terrestrial mammal food webs under current species composition and under species composition unaffected by extinction, local
extirpation, and introduction. Lowercase letters correspond to locations on the map and illustrations of species composition, with color showing species that are 
extant at the site and grayscale indicating species that are extirpated or extinct. 


number of extinctions was simulated, but in 
which the identities of mammal extinctions 
were randomized. This allowed us to distin- 
guish between two alternatives regarding the 
resilience of food webs. If food webs have been 
resilient under extinction pressure either as a 
result of trophic redundancy (29) or because 
extinction is less likely for highly interactive 
species, then food web complexity would de- 
cline less than what would be expected by 
chance under random extinction. Alternatively, 
disproportionately greater extinction of highly 
interactive or functionally distinct species (30) 
would cause greater declines than expected by 
chance, indicating food web collapse (31). Thus,
although food webs would be expected to de- 
cline to some extent given reduced species 
richness, the null model allows us to evaluate 
quantitatively whether observed food webs ex- 
hibit resilience (i.e., declining statistically less 
than the null model as reflected by nonover- 
lapping confidence intervals), a proportionate 
decline (i.e., declining as expected by chance), 
or food web collapse (i.e., declining more than
expected by chance). Note that the food webs 
include only species that interact with at least 
one other mammal species; thus, the number 
of species participating in a food web may de- 
cline faster than the number of species present 
if a species that is present no longer participates 
in the mammal food web (e.g., a species no 
longer co-occurs with its mammalian predator). 
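
The null-model comparison described above can be sketched on a toy food web. Everything in the example is hypothetical: the species names, the links, and the choice of which species went extinct. The study's comparison relied on confidence intervals around the null expectation; the sketch only compares the observed link count against the mean of the randomized draws.

```python
import random


def count_links(links, present):
    """Number of predator-prey links whose two species are both present."""
    return sum(1 for predator, prey in links if predator in present and prey in present)


def null_link_counts(species, links, n_extinct, n_draws, seed=0):
    """Link counts after removing n_extinct randomly chosen species, repeated n_draws times."""
    rng = random.Random(seed)
    pool = sorted(species)  # sorted list so the draws are reproducible
    counts = []
    for _ in range(n_draws):
        extinct = set(rng.sample(pool, n_extinct))
        counts.append(count_links(links, species - extinct))
    return counts


# Toy web (hypothetical): one highly connected predator and one specialist.
SPECIES = {"big_cat", "small_cat", "deer", "pig", "hare", "mouse"}
LINKS = {("big_cat", "deer"), ("big_cat", "pig"), ("big_cat", "hare"),
         ("big_cat", "mouse"), ("small_cat", "mouse")}
OBSERVED_EXTINCT = {"big_cat"}

observed = count_links(LINKS, SPECIES - OBSERVED_EXTINCT)
null_counts = null_link_counts(SPECIES, LINKS, len(OBSERVED_EXTINCT), n_draws=1000)
null_mean = sum(null_counts) / len(null_counts)

print(f"links remaining after observed extinction: {observed}")
print(f"mean links remaining under random extinction: {null_mean:.2f}")
```

In this toy case, the loss of the highly connected predator removes more links than a random extinction would on average, which is the pattern the text labels food web collapse.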

We found that in most cases, the observed 
declines in each region either started to be 
statistically more severe than the null expec- 
tation, or became more severe still, after the 
arrival of Homo sapiens (8) or European col- 
onization (Fig. 2). Notably, the two regions 
occupied by hominins prior to Homo sapiens 
(Africa and Eurasia) exhibited the smallest- 
magnitude declines, and these declines differed 
little from the null expectation until industri- 
alization. One potential explanation for the 
relative resilience of African and Eurasian 
food webs compared to other regions may be
more gradual coevolution among hominins
and other mammals in these regions (32). Comparing
observed extinctions to the null model,
we estimate that food webs globally have lost 
57% more links and 60% more species than 
would be expected by chance. Extinct species 
on average interacted with more species than 
did extant species (fig. S1A; F1,4001 = 180, P ≪
0.0001), and their loss has contributed to mammal
food web collapse.

We next sought to understand how contem- 
porary food webs have been shaped by past 
extinctions and range changes. We compared 
food webs constructed under alternate scenar- 
ios of species composition to determine the 
cumulative effects of extinction and range 
changes. To isolate the effect of extinction, we 
compared food webs composed of only extant 
species to food webs composed of both extant 
and now-extinct mammals. In both cases, we 
generated food webs as though all species 
filled their natural ranges (27), with ranges 
unaffected by range loss or anthropogenic 
species introductions (6). The largest declines 
in species and links due to extinction alone were
in the Americas, Australia, and Madagascar 
(Fig. 3, A and B). In these areas, the con- 
comitant decline in links per species indicates 
that extinct species were disproportionately 
important for food web complexity (Fig. 3A 
and fig. S1A). Attempts to fully restore food
webs in these areas would require replace- 
ments by functionally equivalent species native 
to other regions (33). At the region scale, ex- 
tinctions have caused average declines of 7% to 
73% in the number of participating species and 
average declines of 11% to 83% in food web 
links (Fig. 3B). Globally, we estimate that ex- 
tinctions alone have caused a 20% decline in 
the number of species participating in terres- 
trial mammal food webs and a 29% reduction 
in food web links. 

To assess the net effects of species range 
changes on food web complexity, we compared 
food webs consisting of extant species in their 
current ranges to food webs constructed for 
extant species as though their ranges were 
unaffected by local extirpation or anthropogenic
species introduction (6). We found that
further losses were widespread, also greatly 
affecting areas whose food webs were less 
affected by extinctions, including Africa and 
southern Eurasia (Fig. 3). Relative declines due 
to range changes, considering extant species 
only, were 7% to 76% across regions in the 
number of participating species and 13% to 
82% in food web links (Fig. 3A), amounting to 
a further 20% global decline in participating 
species and 35% in links. Declines in links per 
species due to range changes indicate that spe- 
cies that experienced distributional contractions 
were disproportionately interactive within their 
former ranges (Fig. 3A and fig. S1B; F1,3765 =
114.5, P ≪ 0.0001). When cumulative effects of
extinctions and range changes were combined 
at the region scale and compared against the 
scenario where both extinct and extant species 
filled their natural ranges, average declines in 
participating species ranged from 27% to 94% 
and in links from 45% to 97% (Fig. 3B). At the 
global scale, we estimate that late-Quaternary 
defaunation has resulted in mammal food webs
on average consisting of 35% fewer species
and 53% fewer links.

Lastly, we considered how the potential fu- 
ture extinction of endangered mammals would 
affect food webs globally. Relative to current 
species composition, the largest further losses 
in the number of participating species and the 
number of food web links would occur in areas 
including the Arctic and tropical and sub- 
tropical regions of Africa and Asia (Fig. 3A). In 
most areas where food webs are threatened by 
species loss, endangered species extinction 
would decrease the number of links per spe- 
cies (Fig. 3A). In other words, endangered 
species are central to preserving food web 
complexity in such areas. Exceptions include 
equatorial central Africa and parts of south- 
east Asia, where endangered species extinc- 
tion would reduce the number of species and 
food web links but would not substantially 
alter links per species. Generally, endangered 
species interact more broadly than extant
non-endangered species (fig. S1C; F1,3765 = 19.0, P ≪
0.0001), indicating their greater structural
importance within food webs. Across regions, 
endangered species extinction would cause 
10% to 67% further reductions in participating 
species and 15% to 80% in links beyond that 
incurred by existing defaunation (Fig. 3A), 
amounting to a further 14% global reduction 
in participating species and 21% reduction in 
food web links. 

Our reconstruction of terrestrial mammal 
food webs allowed us to estimate the global 
magnitude of mammal food web collapse since 
the Late Pleistocene. We found that although 
only ~6% of terrestrial mammal species have 
gone extinct since the Late Pleistocene (6), 
more than half of mammal food web links have 
disappeared. Although much of the global food 
web simplification has resulted from extinc- 
tions that occurred centuries to millennia ago, 
range contractions in surviving species explain 
a similar magnitude of simplification. Con- 
trolled and natural experiments show that 
food web complexity supports ecosystem 
resilience and functioning (10, 34, 35), and
ecological network simplification reduces eco- 
system functioning (36). We found that the 
species most affected by defaunation are 
among the strongest contributors to food 
web complexity. Recovering food web com- 
plexity could be achieved with natural recolo- 
nization (37) and reintroduction (38, 39) of 
native mammals to their historic ranges, or 
with non-native functional analogs where nec- 
essary and appropriate (33, 39). Critical roles 
played by species affected by range contrac- 
tions and recognized as endangered further 
underscore the need for their conservation 
to sustain food webs, as well as the strong 
potential for the restoration of food webs in 
the Anthropocene through recovery of these 
species to their historic ranges (40). 

Fig. 3. Attributing change in food webs to extinction and range changes
and assessing threats due to species endangerment. (A) Color scale
indicates relative percent change in the number of participating species, number
of links, and links per species due to extinction, the further percent change due
to current range losses and species introductions, and the percent change
relative to current species composition under potential future extinction of
endangered species. Note that relative percent change values can sum to
>100%. (B) Region-level absolute percent changes, with ±1.96 standard error
bars in gray, presented relative to food webs under species composition
unaffected by extinction or range changes.

POSTDOCTORAL POSITION IN
HOST-PATHOGEN-VECTOR INTERACTIONS
Department of Internal Medicine and
Department of Microbial Pathogenesis


The Laurent-Rolle lab at Yale University, New Haven, CT is now accepting 
applications for Postdoctoral associates in the area of host-pathogen- 
vector interactions to start as soon as possible. 


We are interested in how RNA viruses evade the host immune system 
and how host-responses to virus infections prevent or enhance infections. 
There are multiple ongoing projects that are interdisciplinary and highly 
collaborative including studies on SARS-CoV-2, Zika virus and Eastern 
Equine Encephalitis virus. Specifically, we are interested in 1) the 
mechanisms of actions of interferon stimulated genes in host resistance 
to viral infections, 2) dissecting the molecular mechanisms utilized by 
pathogenic viruses to antagonize host immune responses, 3) the 
development of vaccines and novel therapeutics against viral infections, 
and 4) identifying biomarkers of viral infections for early diagnosis. 


Eligible candidates should have, or be close to completing, a Ph.D. or
M.D. in immunology, virology, molecular biology, or related fields. Highly
motivated individuals with experience in innate immunity, protein 
biochemistry, and animal experimentation are encouraged to apply. 


Interested applicants should send a cover letter explaining why you think
this lab would be a good fit for you, a CV, and contact information for three
references to Dr. Maudry Laurent-Rolle at maudry.laurent-rolle@yale.edu.
The subject of the email should be “Postdoctoral Position in
Host-Pathogen interaction”.


Yale University is an affirmative action, equal opportunity employer. 
Applications from women and minorities are encouraged. 
MOLECULAR MEDICINE AND GENETICS


The Center for Molecular Medicine and Genetics in the Wayne State University 
School of Medicine (http://genetics.wayne.edu/) is expanding its high-impact
research unit and is seeking outstanding candidates who are pursuing
fundamental and/or translational problems that would complement existing 
strengths represented in the Center. Particular foci of interest are: (1) human 
genetics and genomics and (2) metabolism and inflammation, although we are 
interested in outstanding candidates in other areas. 


This tenure-track position can be filled at any level. A competitive start-up 
package will be provided, commensurate with the candidate’s experience 
and achievement. Candidates will be expected to establish an internationally 
recognized, extramurally funded research program and to participate in graduate 
and medical teaching and service. Opportunities for scientific collaboration are 
available within the Center and externally in multiple areas. 


Candidates holding a PhD and/or MD, or equivalent degree(s) with a strong 
record of research productivity are encouraged to apply. For further information 
and to submit your application, please visit our online application site at
https://jobs.wayne.edu/ (Posting #046677). Candidates should include a cover letter,
research description (1-3 pages) and teaching philosophy along with their
curriculum vitae.


Our Center has a long history of providing high quality education and pioneering 
research for medical students, graduate students in our PhD and MS programs, 
and students in our Genetic Counseling program. A strength of our graduate 
programs is that, in addition to developing a high level of expertise in a research 
discipline, our students are required to develop a broad understanding in genetics 
and genomics technologies and computation. 


Wayne State University is a premier, public, urban research university located 
in the heart of Detroit where students from all backgrounds are offered a rich, high 
quality education. Our deep-rooted commitment to excellence, collaboration, 
integrity, diversity and inclusion creates exceptional educational opportunities 
preparing students for success in a modern global society. Wayne State 
University encourages applications from women, people of color and others 
underrepresented in medicine. Wayne State University is an affirmative action/ 
equal opportunity employer. Metropolitan Detroit, with a population of four 
million, encompasses many outstanding residential areas, fine public and private 
schools for K-12 education, myriad cultural and recreational opportunities, and 
proximity to other outstanding institutions of higher education. 


IT’S NOT 
JUST A JOB. 


A CALLING. 


Find your next job at ScienceCareers.org


Whether you’re looking to get ahead, get into, or just plain get 
advice about careers in science, there’s no better or more trusted 
authority. Get the scoop, stay in the loop with Science Careers. 


ScienceCareers 


FROM THE JOURNAL SCIENCE AAAS


Massachusetts Institute of Technology


FACULTY POSITIONS 
DEPARTMENT OF BIOLOGICAL ENGINEERING 


The MIT Department of Biological Engineering in Cambridge, MA invites applications for tenure-track faculty positions at the assistant professor level, to begin 
July 1, 2023 or on a mutually agreeable date thereafter. MIT Biological Engineering is a unique department that fuses molecular and cellular bioscience with 
quantitative engineering approaches and includes faculty and students with diverse scientific interests and backgrounds. Areas of high priority for growth include 
biotechnology beyond medicine, including applications in materials, chemicals, agriculture, sustainability, energy and the environment, as well as innovative 
new areas for the application of biotechnology. Applicants should hold a Ph.D. in a science or engineering discipline related to Biological Engineering by the 
beginning of employment and should have demonstrated experience in biological engineering research. A more senior faculty appointment may be considered in 
special cases. The department culture is vibrant, creative, and collaborative, and we strive to help each member of the community reach their greatest potential. 
Faculty duties include teaching at the graduate and undergraduate levels, conducting research, as well as supervision, training and mentoring students and junior 
researchers. Candidates should be capable of instructing in our core biological engineering educational curricula. 


Candidates must register with the BE search website at http://be-fac-search.mit.edu, and must submit application materials electronically to this website. 
Candidate applications should include a description of professional interests and goals in both teaching and research. Each application should include a 
curriculum vitae and the names and addresses of three or more references who will provide recommendation letters. References should submit their letters directly
at the http://be-fac-search.mit.edu website. 


Applications received by February 28, 2023 will be given priority. 


Questions may be directed to: Prof. Angela Belcher, Head, Department of Biological Engineering, MIT: 77 Massachusetts Avenue, 16-343, Cambridge, MA 
02139, bedh@mit.edu 


With MIT’s strong commitment to diversity in engineering and science education and research, we especially encourage those who will contribute to our diversity 
and outreach efforts to apply. 


MIT is an equal employment opportunity employer. All qualified applicants will receive consideration for employment and will not be discriminated 
against on the basis of race, skin color, gender identity, sexual orientation, religion, disability, age, genetic information, veteran status, ancestry, 
or national or ethnic origin. MIT’s full policy on Nondiscrimination can be found at:
https://policies.mit.edu/policies-procedures/90-relations-and-responsibilities-within-mit-community/92-nondiscrimination 


THE UNIVERSITY OF CHICAGO


Instructional Professors in Neuroscience, Biochemistry and Computational Biology


DESCRIPTION

The Neuroscience Institute and the Biological Sciences Collegiate Division at the University of Chicago are
accepting applications for two open rank Instructional Professor positions pending budgetary approval.
These are full-time teaching positions beginning on or after January 1, 2023. The appointment will be for an
initial term of at least two years with reappointment and progression possible following review.


Responsibilities include teaching undergraduate neuroscience and/or biological science courses. Additional
duties include hiring, training, and supervising teaching assistants. Instructional Professors of all ranks are
required to engage in regular professional development.


QUALIFICATIONS

Applicants must have a PhD in a relevant discipline in hand prior to start date and at least one year of teaching
university level lectures and/or laboratory courses. We are particularly interested in applicants with expertise
in neuroscience, biochemistry, and/or computational biology.


APPLICATION INSTRUCTIONS

Applicants must apply online at the University of Chicago’s Interfolio website at
apply.interfolio.com/111651. The required application materials are: 1) a cover letter; 2) a curriculum vitae;
3) a teaching statement; 4) a sample course syllabus; 5) course evaluations or evidence of past teaching
performance; 6) the names and contact information for three references.


Review of applications will begin on September 19, 2022 and will continue until the positions are filled or
the search is closed.


The positions will be part of the Service Employees International Union.


We seek a diverse pool of applicants who wish to join an academic community that places the highest
value on rigorous inquiry and encourages diverse perspectives, experiences, groups of individuals, and ideas
to inform and stimulate intellectual challenge, engagement, and exchange. The University’s Statements on
Diversity are at https://provost.uchicago.edu/statements-diversity.


The University of Chicago is an Affirmative Action/Equal Opportunity/Disabled/Veterans Employer and does
not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity, national or
ethnic origin, age, status as an individual with a disability, protected veteran status, genetic information,
or other protected classes under the law. For additional information please see the University’s Notice of
Nondiscrimination.


Job seekers in need of a reasonable accommodation to complete the application process should call
773-834-3988 or email equalopportunity@uchicago.edu with their request.


A career plan customized for you, by you.

Features in myIDP include:
• Exercises to help you examine your skills, interests, and values.
• A list of 20 scientific career paths with a prediction of which ones best fit your skills and interests.

Visit the website and start planning today!
myIDP.sciencecareers.org

There's only one Science


online @sciencecareers.org 
WORKING LIFE


By M. Shehryar Khan


From dropout to Ph.D.


I was on a long-haul trucking trip, somewhere between Mississauga and Montreal, but my mind was
elsewhere. I was thinking about my personal journey and wondering whether this was my future.
I had dropped out of university, and now I was riding with a family friend who owned a trucking
company to see whether this might be a viable career option for me. Trucking was more reliable
and paid better than what I had been doing, juggling three part-time jobs in retail and customer
service to make ends meet. I was helping support my mother and two younger sisters and needed
a job I could count on. But I couldn’t shake the sense that I would be choosing this route with feelings
of failure looming over my head, a path I knew would lead to unhappiness.


When I was younger, I had been a straight-A student with aspirations to attend a top university.
But things started to go downhill for me in middle school after the 9/11 attacks perpetrated against
the United States, where I was living as a recent immigrant from Pakistan. At school the following
day, a teacher publicly insinuated that I was somehow responsible for the attacks. That evening, I was
seriously injured during football practice. The next day, our family businesses were vandalized. The
racism posed a clear threat to our safety and pushed us to move back to Pakistan.

Then, my parents divorced. Torn between wanting to be with my mother, who moved to Canada
alone with no support, or with my beloved aging grandfather in Pakistan, I moved back and forth but
became depressed and struggled with my academics. I barely graduated from high school and enrolled
at the only university that offered me a spot, feeling that I had already failed.

That mindset triggered a vicious cycle. I was too discouraged to thrive in my studies, and my
resulting struggles reinforced my belief that I was failing. I began to suffer from severe depression,
and the death of my grandfather was the final straw. With no real hope for the future, I dropped out
and returned to my mother and sisters in Mississauga.

That’s when I went on the trucking run. During those long hours on the road, my grandfather’s
last words of advice, in a letter he wrote when I started university, came back to me: “Please be
brave (which you are), accept the challenges of life (which is never a bed of roses), work hard
and harder, and you will see the blessings of Almighty God and our prayers are with you at all times.
It is now or never. I hope you will never disappoint us.” With his words ringing in my ears, I
discovered a newfound resolve. My path suddenly became clear: I had to finish what I had started
and go back to school.

To get back on track, I needed to redo grade 12. I took a full course load while continuing to work
enough part-time hours at Home Depot and Walmart to help pay the bills. After graduating from high
school (again!), I began a university engineering co-op program, in which I would work as a paid
intern every alternate term, allowing me to keep my student loans in check while continuing to help
support my family.

At first, the fear of failure that had doomed my previous university endeavor continued to linger at
the back of my mind. But I knew it would hold me back if I didn’t do something about it. So,
I finally started counseling and therapy, which helped me realize that the outcome didn’t matter as
much as knowing I tried my best, and over time I learned to keep my fear of failure in check. After
completing my bachelor’s degree, I went on to a master’s and now a Ph.D., winning several
research awards along the way, something I scarcely could have imagined just a few years earlier.

Grad school has been full of the challenges and setbacks that every student is well aware of, but
my path to this point has made one thing clear: My fear of failure not only limited me, but kept me
from achieving my goals. I finally internalized the words my grandfather wrote to me in that
letter all those years ago. I hope I have made him proud.


M. Shehryar Khan is a Ph.D. candidate at the University of Waterloo. 
Send your career story to SciCareerEditor@aaas.org. 
MICHELSON
PHILANTHROPIES


PRIZE FOR IMMUNOLOGY


REWARDING HIGH-RISK RESEARCH.
SUPPORTING EARLY-CAREER SCIENTISTS.
HELPING TO FIND CURES FASTER.

APPLY TODAY


Now accepting applications for the Michelson
Philanthropies & Science Prize for Immunology.
The Michelson Philanthropies and Science Prize for
Immunology focuses on transformative research in
human immunology, with trans-disease applications
to accelerate vaccine and immunotherapeutic
discovery. This international prize supports investigators
35 and younger, who apply their expertise to research
that has a lasting impact on vaccine development and
immunotherapy. It is open to researchers from a wide
range of disciplines including computer science,
artificial intelligence/machine learning, protein
engineering, nanotechnology, genomics, parasitology
and tropical medicine, neurodegenerative diseases,
and gene editing.


Application deadline: Oct. 1, 2022.


For more information visit:
www.michelsonmedicalresearch.org


#MichelsonPrizes


“The Michelson Philanthropies & Science
Prize for Immunology will greatly impact
my future work. As I am just starting my
scientific career, it will illuminate my work,
spark interest and support me to continue
my research in this field.”


Paul Bastard, MD, PhD,
Laboratory of Human Genetics
of Infectious Diseases, Imagine
Institute (INSERM, University of
Paris), Paris, France; and The
Rockefeller University, New York.


Dr. Bastard received the inaugural
Grand Prize for his essay: “Why
do people die from COVID-19:
Autoantibodies neutralizing type
I interferons increase with age.”


GRAND PRIZE: 
$30,000 


FINALIST PRIZE: 
$10,000 


mRNA-LNP Products & Services


For Vaccine and Immunotherapy Development 


Antibody Engineering 


Antibody Services
• Monoclonal Ab Development
• Hybridoma Sequencing
• Human Antibody Screening
• Antibody Humanization
• Bispecific Antibody Engineering
• Stable Cell Line Generation

mRNA-LNP Services
• mRNA Design and Synthesis (IVT)
• mRNA Capping & Purification
• mRNA-LNP Encapsulation
• mRNA-LNP Scale Up and QC
• mRNA-LNP Validation In Vivo
• Ionizable Lipid Licensing

Immune Cell Services
• Lentivirus Production GMP
• CAR-T/NK Cell Generation
• CAR-T/NK Animal Model
• Immune Cell Engineering
• T/B/NK/Dendritic cell, Macrophages
• TCR knockout by CRISPR


All products are for research use only 


Discover more | www.promab.com 


ProMab Biotechnologies, Inc.
2600 Hilltop Dr, Building B, Suite C320, Richmond, CA 94806
1.866.339.0871 | info@promab.com


